Abstract

In many typical mobile communication receivers the channel is estimated based on pilot symbols to allow for
a coherent detection and decoding in a separate processing step. Currently much work is spent on receivers which
break up this separation, e.g., by enhancing channel estimation based on reliability information on the data symbols.
In the present work, we evaluate the possible gain of a joint processing of data and pilot symbols in comparison to
the case of a separate processing in the context of stationary Rayleigh flat-fading channels. Therefore, we discuss the
nature of the possible gain of a joint processing of pilot and data symbols. We show that the additional information
that can be gained by a joint processing is captured in the temporal correlation of the channel estimation error of the
solely pilot based channel estimation, which is not retrieved by the channel decoder in case of separate processing.
In addition, we derive a new lower bound on the achievable rate for joint processing of pilot and data symbols.

Index Terms

Channel capacity, fading channels, information rates, joint processing, mismatched decoding, noncoherent,
Rayleigh, time-selective.

I. Introduction 
Virtually all practical mobile communication systems face the problem that communication takes
place over a time varying fading channel whose realization is unknown to the receiver. However, for
coherent detection and decoding an estimate of the channel fading process is required. For the purpose of
channel estimation usually pilot symbols, i.e., symbols which are known to the receiver, are introduced
into the transmit sequence. In conventional receiver design the channel is estimated based on these
pilot symbols. Based on these channel estimates, in a separate step coherent detection and decoding
is performed. Both processing steps are executed separately.

In recent years, much effort has been spent on the study of iterative joint channel estimation and
decoding schemes, i.e., schemes in which the channel estimation is iteratively enhanced based on reliability
information on the data symbols delivered by the decoder, see, e.g., [1]-[5]. In this context, the channel
estimation is not solely based on pilot symbols, but also on data symbols. This approach is an instance of
a joint processing of data and pilot symbols in contrast to the separate processing in conventional receiver
design. Obviously, this joint processing results in an increased receiver complexity. To evaluate the payoff
for the increased receiver complexity, it is important to study the possible performance gain that can be
achieved by a joint processing, e.g., in form of an iterative code-aided channel estimation and decoding
based receiver, in comparison to a separate processing as it is performed in conventional synchronized
detection based receivers, where the channel estimation is solely based on pilot symbols.

Therefore, in the present work we evaluate the performance of a joint processing in comparison to
synchronized detection with a solely pilot based channel estimation based on the achievable rate. Regarding 
the channel statistics, we assume a stationary Rayleigh flat-fading channel as it is usually applied to model 
the fading in a mobile environment without a line of sight component. Furthermore, we assume that the 
power spectral density (PSD) of the channel fading process is compactly supported, and that the fading 
process is non-regular Q, which is reasonable as the maximum Doppler frequency of typical fading 
channels is small in comparison to the inverse of the symbol duration. Furthermore, we assume that the 
receiver is aware of the law of the channel, while neither the transmitter nor the receiver knows the 
realization of the channel fading process. 

There has been a variety of publications studying the achievable rate with pilot symbols, see, e.g.,
[6]-[12]. Many of these works discuss the achievable rate under the assumption that a channel estimate is
acquired based on pilot symbols which is then used for coherent detection, i.e., separate processing. Some
of these works consider block-fading, [7], [10], and [11], while [8] and [9] specifically discuss the case
of stationary fading. For the case of a stationary single-input single-output Rayleigh flat-fading channel,
as we study in the present work, tight bounds on the achievable rate with synchronized detection with
a solely pilot based channel estimation, i.e., separate processing, have been given in [8]. In contrast, for
the case of a joint processing there is not much knowledge on the achievable rate. Very recently, in [13]
the value of joint processing of pilot and data symbols has been studied in the context of a block-fading
channel. To the best of our knowledge, there are no results concerning the gain of joint processing of
pilot and data symbols for the case of stationary fading channels. Thus, in the present work, we study the
achievable rate with a joint processing of pilot and data symbols. We identify the nature of the possible
gain of a joint processing of pilot and data symbols in comparison to a separate processing. Furthermore,
we derive a lower bound on the achievable rate with joint processing of pilot and data symbols, which,
thus, can be seen as an extension of the work given in [13] to the case of stationary Rayleigh flat-fading.
In addition, we compare the given lower bound on the achievable rate with joint processing of pilot and
data symbols to bounds on the achievable rate with separate processing given in [8] and to bounds on
the achievable rate with i.i.d. zero-mean proper Gaussian input symbols given in [14], i.e., without the
assumption on pilot symbols inserted into the transmit sequence.

The rest of the paper is organized as follows. In Section II the system model is introduced. Subsequently,
in Section III we discuss the nature of the gain by a joint processing of pilot and data symbols, i.e., we
discuss which information is discarded in case of a separate processing. Furthermore, existing bounds on
the achievable rate with separate processing are briefly recalled. Afterwards, in Section IV a new lower
bound on the achievable rate with a joint processing of pilot and data symbols is derived, before it is
numerically evaluated and compared to the achievable rate with separate processing and to the achievable
rate with i.i.d. zero-mean proper Gaussian inputs in Section V. Finally, Section VI concludes the paper
with a brief summary.



II. System Model 

We consider a discrete-time zero-mean jointly proper Gaussian flat-fading channel with the following 
input-output relation 

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n} = \mathbf{X}\mathbf{h} + \mathbf{n} \tag{1}$$

with the diagonal matrices $\mathbf{H} = \mathrm{diag}(\mathbf{h})$ and $\mathbf{X} = \mathrm{diag}(\mathbf{x})$. Here the $\mathrm{diag}(\cdot)$ operator generates a diagonal
matrix whose diagonal elements are given by the argument vector. The vector $\mathbf{y} = [y_1, \ldots, y_N]^T$ contains
the channel output symbols in temporal order. Analogously, $\mathbf{x} = [x_1, \ldots, x_N]^T$, $\mathbf{n} = [n_1, \ldots, n_N]^T$, and
$\mathbf{h} = [h_1, \ldots, h_N]^T$ contain the channel input symbols, the additive noise samples, and the channel fading
weights. All vectors are of length $N$.

The samples of the additive noise process are assumed to be i.i.d. zero-mean jointly proper Gaussian
with variance $\sigma_n^2$ and, thus, $\mathbf{R}_n = \mathrm{E}[\mathbf{n}\mathbf{n}^H] = \sigma_n^2 \mathbf{I}_N$, with $\mathbf{I}_N$ being the identity matrix of size $N \times N$.

The channel fading process is zero-mean jointly proper Gaussian with the temporal correlation characterized by

$$r_h(l) = \mathrm{E}[h_{k+l} h_k^*]. \tag{2}$$

Its variance is given by $r_h(0) = \sigma_h^2$. For mathematical reasons we assume that the autocorrelation function
$r_h(l)$ is absolutely summable, i.e.,

$$\sum_{l=-\infty}^{\infty} |r_h(l)| < \infty. \tag{3}$$

The PSD of the channel fading process is defined as

$$S_h(f) = \sum_{m=-\infty}^{\infty} r_h(m) e^{-j 2\pi m f}, \quad |f| < 0.5. \tag{4}$$

We assume that the PSD exists, which for a jointly proper Gaussian fading process implies ergodicity.
Furthermore, we assume the PSD to be compactly supported within the interval $[-f_d, f_d]$ with $f_d$ being
the maximum Doppler shift and $0 < f_d < 0.5$. This means that $S_h(f) = 0$ for $f \notin [-f_d, f_d]$. The
assumption of a PSD with limited support is motivated by the fact that the velocity of the transmitter, the
receiver, and of objects in the environment is limited. To ensure ergodicity, we exclude the case $f_d = 0$.

In matrix-vector notation, the temporal correlation is expressed by the autocorrelation matrix given
by

$$\mathbf{R}_h = \mathrm{E}[\mathbf{h}\mathbf{h}^H]. \tag{5}$$

For the following derivation we introduce the subvector $\mathbf{x}_D$ containing all data symbols of $\mathbf{x}$ and the
vector $\mathbf{x}_P$ containing all pilot symbols of $\mathbf{x}$. Correspondingly, we define the vectors $\mathbf{h}_D$, $\mathbf{h}_P$, $\mathbf{y}_D$, $\mathbf{y}_P$, $\mathbf{n}_D$,
and $\mathbf{n}_P$.

The transmit symbol sequence consists of data symbols with a maximal average power $\sigma_x^2$, i.e.,

$$\frac{1}{N_D} \mathrm{E}[\mathbf{x}_D^H \mathbf{x}_D] \le \sigma_x^2 \tag{6}$$

with $N_D$ being the length of the vector $\mathbf{x}_D$, and periodically inserted pilot symbols with a fixed transmit
power $\sigma_x^2$. Each $L$-th symbol is a pilot symbol. We assume that the pilot spacing is chosen such that the
channel fading process is sampled at least with Nyquist rate, i.e.,

$$L \le \frac{1}{2 f_d}. \tag{7}$$

The processes $\{x_k\}$, $\{h_k\}$, and $\{n_k\}$ are assumed to be mutually independent.
Based on the preceding definitions the average SNR $\rho$ is given by

$$\rho = \frac{\sigma_x^2 \sigma_h^2}{\sigma_n^2}. \tag{8}$$
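As a small numerical illustration of these definitions, the following sketch computes the average SNR and the largest pilot spacing that still samples the fading process at least at the Nyquist rate. It assumes the SNR is defined as $\rho = \sigma_x^2 \sigma_h^2 / \sigma_n^2$ and that the Nyquist condition reads $L \le 1/(2 f_d)$; the function names and parameter values are illustrative only.

```python
import math


def average_snr(sigma_x2, sigma_h2, sigma_n2):
    """Average SNR rho, assuming rho = sigma_x^2 * sigma_h^2 / sigma_n^2 (linear scale)."""
    return sigma_x2 * sigma_h2 / sigma_n2


def max_pilot_spacing(f_d):
    """Largest integer L with L <= 1 / (2 f_d), f_d normalized to the symbol rate."""
    if not 0.0 < f_d < 0.5:
        raise ValueError("normalized maximum Doppler shift must satisfy 0 < f_d < 0.5")
    return math.floor(1.0 / (2.0 * f_d))


def pilot_positions(n, spacing):
    """Indices of the pilot symbols when every `spacing`-th symbol is a pilot."""
    return list(range(0, n, spacing))


if __name__ == "__main__":
    rho = average_snr(sigma_x2=1.0, sigma_h2=1.0, sigma_n2=0.1)
    spacing = max_pilot_spacing(f_d=0.03)
    print(f"rho = {rho:.1f}, largest admissible L = {spacing}")
    print("pilot positions for N = 40:", pilot_positions(40, spacing))
```

With these helpers, a fraction $(L-1)/L$ of the symbol positions remains available for data.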



III. The Nature of the Gain by Joint Processing of Data and Pilot Symbols 

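Before we quantitatively discuss the value of a joint processing of data and pilot symbols, we discuss
the nature of the possible gain of such a joint processing in comparison to a separate processing of
data and pilot symbols. The mutual information between the transmitter and the receiver is given by
$\mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P)$. As the pilot symbols are known to the receiver, the pilot symbol vector $\mathbf{x}_P$ is found
at the RHS of the semicolon. We separate $\mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P)$ as follows

$$\begin{aligned}
\mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) &\overset{(a)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D | \mathbf{y}_P, \mathbf{x}_P) + \mathcal{I}(\mathbf{x}_D; \mathbf{y}_P | \mathbf{x}_P) + \mathcal{I}(\mathbf{x}_D; \mathbf{x}_P) \\
&\overset{(b)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D | \mathbf{y}_P, \mathbf{x}_P)
\end{aligned} \tag{9}$$

where (a) follows from the chain rule for mutual information and (b) holds due to the independence
of the data and pilot symbols. The question is which portion of $\mathcal{I}(\mathbf{x}_D; \mathbf{y}_D | \mathbf{y}_P, \mathbf{x}_P)$ can be achieved by
synchronized detection with a solely pilot based channel estimation, i.e., with separate processing.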
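A. Separate Processing

The receiver has to find the most likely data sequence $\mathbf{x}_D$ based on the observation $\mathbf{y}$ while knowing
the pilots $\mathbf{x}_P$, i.e.,

$$\hat{\mathbf{x}}_D = \arg\max_{\mathbf{x}_D \in \mathcal{C}_D} p(\mathbf{y} | \mathbf{x}) = \arg\max_{\mathbf{x}_D \in \mathcal{C}_D} p(\mathbf{y}_D | \mathbf{x}_D, \mathbf{y}_P, \mathbf{x}_P) \tag{10}$$

with the set $\mathcal{C}_D$ containing all possible data sequences $\mathbf{x}_D$. It can be shown that the probability density
function (PDF) $p(\mathbf{y}_D | \mathbf{x}_D, \mathbf{y}_P, \mathbf{x}_P)$ is proper Gaussian and, thus, is completely described by the conditional
mean and covariance

$$\mathrm{E}[\mathbf{y}_D | \mathbf{x}_D, \mathbf{y}_P, \mathbf{x}_P] = \mathbf{X}_D \mathrm{E}[\mathbf{h}_D | \mathbf{y}_P, \mathbf{x}_P] = \mathbf{X}_D \hat{\mathbf{h}}_{\mathrm{pil},D} \tag{11}$$

$$\mathrm{cov}[\mathbf{y}_D | \mathbf{x}_D, \mathbf{y}_P, \mathbf{x}_P] = \mathbf{X}_D \mathbf{R}_{e_{\mathrm{pil}},D} \mathbf{X}_D^H + \sigma_n^2 \mathbf{I}_{N_D} \tag{12}$$

where $\mathbf{X}_D = \mathrm{diag}(\mathbf{x}_D)$ and $\mathbf{I}_{N_D}$ is an identity matrix of size $N_D \times N_D$. The vector $\hat{\mathbf{h}}_{\mathrm{pil},D}$ is an MMSE
channel estimate at the data symbol time instances based on the pilot symbols, which is denoted by the
index pil. Furthermore, the corresponding channel estimation error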

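$$\mathbf{e}_{\mathrm{pil},D} = \mathbf{h}_D - \hat{\mathbf{h}}_{\mathrm{pil},D} \tag{13}$$

is zero-mean proper Gaussian and

$$\mathbf{R}_{e_{\mathrm{pil}},D} = \mathrm{E}\left[\mathbf{e}_{\mathrm{pil},D} \mathbf{e}_{\mathrm{pil},D}^H \,\middle|\, \mathbf{x}_P\right] \tag{14}$$

is its correlation matrix, which is independent of $\mathbf{y}_P$ due to the principle of orthogonality.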

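Based on (11) and (12), conditioning of $\mathbf{y}_D$ on $\mathbf{x}_D, \mathbf{y}_P, \mathbf{x}_P$ is equivalent to conditioning on $\mathbf{x}_D, \hat{\mathbf{h}}_{\mathrm{pil},D}, \mathbf{x}_P$,
i.e.,

$$p(\mathbf{y}_D | \mathbf{x}_D, \mathbf{y}_P, \mathbf{x}_P) = p(\mathbf{y}_D | \mathbf{x}_D, \hat{\mathbf{h}}_{\mathrm{pil},D}, \mathbf{x}_P) \tag{15}$$

as all information on $\mathbf{h}_D$ delivered by $\mathbf{y}_P$ is contained in $\hat{\mathbf{h}}_{\mathrm{pil},D}$ while conditioning on $\mathbf{x}_P$. Thus, (10) can
be written as

$$\hat{\mathbf{x}}_D = \arg\max_{\mathbf{x}_D \in \mathcal{C}_D} p(\mathbf{y}_D | \mathbf{x}_D, \hat{\mathbf{h}}_{\mathrm{pil},D}, \mathbf{x}_P) = \arg\max_{\mathbf{x}_D \in \mathcal{C}_D} p(\mathbf{y} | \mathbf{x}_D, \hat{\mathbf{h}}_{\mathrm{pil}}, \mathbf{x}_P). \tag{16}$$

For ease of notation in the following we will use the metric on the RHS of (16) where $\hat{\mathbf{h}}_{\mathrm{pil}}$ corresponds
to $\hat{\mathbf{h}}_{\mathrm{pil},D}$ but also contains channel estimates at the pilot symbol time instances, i.e.,

$$\hat{\mathbf{h}}_{\mathrm{pil}} = \mathrm{E}[\mathbf{h} | \mathbf{y}_P, \mathbf{x}_P]. \tag{17}$$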

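Based on $\hat{\mathbf{h}}_{\mathrm{pil}}$, (1) can be expressed by

$$\mathbf{y} = \mathbf{X}(\hat{\mathbf{h}}_{\mathrm{pil}} + \mathbf{e}_{\mathrm{pil}}) + \mathbf{n} \tag{18}$$

where $\mathbf{e}_{\mathrm{pil}}$ is the estimation error including the pilot symbol time instances. As the channel estimation is
an interpolation, the error process is not white but temporally correlated, i.e.,

$$\mathbf{R}_{e_{\mathrm{pil}}} = \mathrm{E}\left[\mathbf{e}_{\mathrm{pil}} \mathbf{e}_{\mathrm{pil}}^H \,\middle|\, \mathbf{x}_P\right] \tag{19}$$

is not diagonal, cf. (35). As the estimation error process is zero-mean proper Gaussian, the PDF in (16)
is given by

$$p(\mathbf{y} | \mathbf{x}_D, \hat{\mathbf{h}}_{\mathrm{pil}}, \mathbf{x}_P) = \mathcal{CN}\left(\mathbf{X}\hat{\mathbf{h}}_{\mathrm{pil}},\, \mathbf{X}\mathbf{R}_{e_{\mathrm{pil}}}\mathbf{X}^H + \sigma_n^2 \mathbf{I}_N\right) \tag{20}$$

where $\mathcal{CN}(\boldsymbol{\mu}, \mathbf{C})$ denotes a proper Gaussian PDF with mean $\boldsymbol{\mu}$ and covariance $\mathbf{C}$ and where $\mathbf{I}_N$ is the
$N \times N$ identity matrix.¹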
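That the interpolation error is temporally correlated can be checked numerically for a short block. The sketch below is only an illustration under assumptions that the text does not fix: it uses the autocorrelation $r_h(l) = \sigma_h^2\,\mathrm{sinc}(2 f_d l)$ of a rectangular PSD, constant-power pilots at every $L$-th position, and the standard LMMSE error covariance $\mathbf{R}_h - \mathbf{R}_{h h_P}(\mathbf{R}_{h_P} + (\sigma_n^2/\sigma_x^2)\mathbf{I})^{-1}\mathbf{R}_{h_P h}$. All function names are hypothetical.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def solve(a, b):
    """Solve a @ x = b (a: n x n, b: n x m, lists of lists) by Gauss-Jordan elimination."""
    n = len(a)
    m = [list(ra) + list(rb) for ra, rb in zip(a, b)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def pilot_error_covariance(n, spacing, f_d, rho, sigma_h2=1.0):
    """Error covariance of the LMMSE channel estimate that uses only the pilots."""
    def r_h(lag):
        return sigma_h2 * sinc(2.0 * f_d * lag)

    pilots = list(range(0, n, spacing))
    # sigma_n^2 / sigma_x^2, assuming rho = sigma_x^2 sigma_h^2 / sigma_n^2
    noise = sigma_h2 / rho
    r_pp = [[r_h(p - q) + (noise if p == q else 0.0) for q in pilots] for p in pilots]
    r_ph = [[r_h(p - k) for k in range(n)] for p in pilots]
    w = solve(r_pp, r_ph)  # (R_pp + noise * I)^{-1} R_ph
    return [[r_h(i - j) - sum(r_h(i - p) * w[a][j] for a, p in enumerate(pilots))
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    r_e = pilot_error_covariance(n=21, spacing=5, f_d=0.05, rho=10.0)
    print("error variances:", [round(r_e[k][k], 4) for k in range(8, 13)])
    print("covariance of neighbouring errors:", round(r_e[10][11], 4))
```

The off-diagonal entries of the returned matrix are clearly nonzero, which is the structure a decoder assuming a white error process ignores.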

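¹Note that for the case of data transmission only, (20) becomes $p(\mathbf{y} | \mathbf{x}_D) = \mathcal{CN}(\mathbf{0}, \mathbf{X}\mathbf{R}_h\mathbf{X}^H + \sigma_n^2 \mathbf{I}_N)$, as in this case $\hat{\mathbf{h}}_{\mathrm{pil}} = \mathbf{0}$ and $\mathbf{R}_{e_{\mathrm{pil}}} = \mathbf{R}_h$.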
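Corresponding to (15), we can also rewrite $p(\mathbf{y}_D | \mathbf{y}_P, \mathbf{x}_P)$ as follows

$$\begin{aligned}
p(\mathbf{y}_D | \mathbf{y}_P, \mathbf{x}_P) &= \int p(\mathbf{y}_D | \mathbf{x}_D, \mathbf{y}_P, \mathbf{x}_P)\, p(\mathbf{x}_D | \mathbf{y}_P, \mathbf{x}_P)\, d\mathbf{x}_D \\
&\overset{(a)}{=} \int p(\mathbf{y}_D | \mathbf{x}_D, \hat{\mathbf{h}}_{\mathrm{pil},D}, \mathbf{x}_P)\, p(\mathbf{x}_D)\, d\mathbf{x}_D \\
&= p(\mathbf{y}_D | \hat{\mathbf{h}}_{\mathrm{pil},D}, \mathbf{x}_P)
\end{aligned} \tag{21}$$

where for (a) we have used (15) and the independence of $\mathbf{x}_D$ from $\mathbf{x}_P$ and $\mathbf{y}_P$.
Based on (15) and (21), we can also rewrite (9) as

$$\mathcal{I}(\mathbf{x}_D; \mathbf{y}_D | \mathbf{y}_P, \mathbf{x}_P) = \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D | \hat{\mathbf{h}}_{\mathrm{pil}}, \mathbf{x}_P) \overset{(a)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D | \hat{\mathbf{h}}_{\mathrm{pil}}) \tag{22}$$

where (a) holds as the pilot symbols are deterministic.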

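However, typical channel decoders like a Viterbi decoder are not able to exploit the temporal correlation
of the channel estimation error. Therefore, the decoder performs mismatched decoding based on the
assumption that the estimation error process is white, i.e., $p(\mathbf{y} | \mathbf{x}_D, \hat{\mathbf{h}}_{\mathrm{pil}}, \mathbf{x}_P)$ is approximated by

$$p(\mathbf{y} | \mathbf{x}_D, \hat{\mathbf{h}}_{\mathrm{pil}}, \mathbf{x}_P) \approx \mathcal{CN}\left(\mathbf{X}\hat{\mathbf{h}}_{\mathrm{pil}},\, \sigma_{e_{\mathrm{pil}}}^2 \mathbf{X}\mathbf{X}^H + \sigma_n^2 \mathbf{I}_N\right). \tag{23}$$

As it is assumed that the channel is at least sampled with Nyquist frequency, see (7), for an infinite
block length $N \to \infty$ the channel estimation error variance $\sigma_{e_{\mathrm{pil}}}^2$ is independent of the symbol time instant
[8] and is given by

$$\sigma_{e_{\mathrm{pil}}}^2 = \int_{-\frac{1}{2}}^{\frac{1}{2}} S_{e_{\mathrm{pil}}}(f)\, df = \int_{-\frac{1}{2}}^{\frac{1}{2}} \frac{S_h(f)}{\frac{\rho}{L}\frac{S_h(f)}{\sigma_h^2} + 1}\, df \tag{24}$$

where $S_{e_{\mathrm{pil}}}(f)$ is the PSD of the channel estimation error process in case the channel estimation is solely
based on pilot symbols, which is given in (102) in Appendix B. Hence, the variance of the channel
estimation process, i.e., the entries of $\hat{\mathbf{h}}_{\mathrm{pil}}$, is given by $\sigma_h^2 - \sigma_{e_{\mathrm{pil}}}^2$, which follows from the principle of
orthogonality in LMMSE estimation.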
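To get a feel for the size of the error variance, the following sketch integrates the error PSD numerically. It assumes that the pilot-based error PSD has the form $S_{e_{\mathrm{pil}}}(f) = S_h(f) / (\frac{\rho}{L}\frac{S_h(f)}{\sigma_h^2} + 1)$ and uses a rectangular PSD as an example, for which the integral reduces to $\sigma_h^2 / (\rho/(2 f_d L) + 1)$. Function names and parameter values are illustrative.

```python
def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def rect_psd(f, f_d, sigma_h2=1.0):
    """Rectangular PSD with total power sigma_h2 on [-f_d, f_d]."""
    return sigma_h2 / (2.0 * f_d) if abs(f) <= f_d else 0.0


def pilot_error_variance(psd, rho, spacing, f_d, sigma_h2=1.0):
    """Integrate the assumed error PSD S_h / (rho / L * S_h / sigma_h2 + 1) over the support."""
    def error_psd(f):
        s = psd(f)
        return s / (rho / spacing * s / sigma_h2 + 1.0)

    # The PSD vanishes outside [-f_d, f_d], so the integral is restricted to the support.
    return simpson(error_psd, -f_d, f_d)


if __name__ == "__main__":
    f_d, rho, spacing = 0.05, 10.0, 5
    var_e = pilot_error_variance(lambda f: rect_psd(f, f_d), rho, spacing, f_d)
    print(f"error variance    : {var_e:.6f}")
    print(f"closed form       : {1.0 / (rho / (2.0 * f_d * spacing) + 1.0):.6f}")
    print(f"estimate variance : {1.0 - var_e:.6f}")
```

For these example values the error variance is $1/21$ of the channel power, and the remaining $20/21$ is the variance of the channel estimate.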

As the information contained in the temporal correlation of the channel estimation error is not retrieved
by synchronized detection with a solely pilot based channel estimation, the mutual information in this
case corresponds to the sum of the mutual information for each individual data symbol time instant. As,
obviously, by this separate processing information is discarded, the following inequality for the achievable
rate holds:

$$\lim_{N\to\infty} \frac{1}{N} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D | \hat{\mathbf{h}}_{\mathrm{pil}}) = \mathcal{I}'(\mathbf{x}_D; \mathbf{y}_D | \hat{\mathbf{h}}_{\mathrm{pil}}) \ge \frac{L-1}{L} \mathcal{I}(x_{D_k}; y_{D_k} | \hat{h}_{\mathrm{pil},D_k}) = \mathcal{R}_{\mathrm{sep}} \tag{25}$$

where $\mathcal{I}'$ denotes the mutual information rate and the index $D_k$ refers to an arbitrarily chosen data symbol,
i.e., $x_{D_k} = [\mathbf{x}_D]_k$. Furthermore, $\hat{h}_{\mathrm{pil},D_k}$ is the solely pilot based channel estimate at the data symbol time
instant $D_k$. The pre-factor $(L-1)/L$ arises from the fact that each $L$-th symbol is a pilot symbol. In the
following, we denote the achievable rate with separate processing by $\mathcal{R}_{\mathrm{sep}}$.

As the LHS of (25) is the mutual information of the channel and as the RHS of (25) is the mutual
information achievable with synchronized detection with a metric corresponding to (23) and a solely
pilot based channel estimation, i.e., a separate processing, the difference of both terms upper bounds the
possible gain due to joint processing of data and pilot symbols. Obviously, the additional information that
can be gained by a joint processing in contrast to the separate processing is contained in the temporal
correlation of the channel estimation error process.






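Regarding synchronized detection in combination with a solely pilot based channel estimation, i.e., the
separate processing approach, in [8] bounds on the achievable rate have been given, which for zero-mean
proper Gaussian data symbols become

$$\mathcal{R}_{\mathrm{sep}} \ge \mathcal{R}_{L,\mathrm{sep}} = \frac{L-1}{L} \int_{z=0}^{\infty} \log\left(1 + \rho\, \frac{1 - \sigma_{e_{\mathrm{pil}}}^2/\sigma_h^2}{1 + \rho\, \sigma_{e_{\mathrm{pil}}}^2/\sigma_h^2}\, z\right) e^{-z}\, dz \tag{26}$$

$$\mathcal{R}_{\mathrm{sep}} \le \mathcal{R}_{U,\mathrm{sep}} = \mathcal{R}_{L,\mathrm{sep}} + \frac{L-1}{L} \int_{z=0}^{\infty} \log\left(\frac{1 + \rho\, \sigma_{e_{\mathrm{pil}}}^2/\sigma_h^2}{1 + \rho\, \frac{\sigma_{e_{\mathrm{pil}}}^2}{\sigma_h^2}\, z}\right) e^{-z}\, dz. \tag{27}$$

Based on the lower bound in (26) it can easily be seen that the achievable rate is decreased in comparison
to perfect channel knowledge by two factors. First, symbol time instances that are used for pilot symbols
are lost for data symbols, leading to the pre-log factor $(L-1)/L$, and secondly, the average SNR is decreased by
the factor $\left(1 - \sigma_{e_{\mathrm{pil}}}^2/\sigma_h^2\right) / \left(1 + \rho\, \sigma_{e_{\mathrm{pil}}}^2/\sigma_h^2\right)$ due to the channel estimation error variance. The additional term in
the upper bound in (27) arises from the fact that the effective noise, i.e., $e_{\mathrm{pil},D_k} x_{D_k} + n_{D_k}$, is non-Gaussian.
Here $e_{\mathrm{pil},D_k}$ is the estimation error at the data symbol time instant $D_k$, i.e., $e_{\mathrm{pil},D_k} = [\mathbf{e}_{\mathrm{pil},D}]_k$.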

IV. Joint Processing of Data and Pilot Symbols 

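Now, we give a new lower bound on the achievable rate for a joint processing of data and pilot
symbols. The following approach can be seen as an extension of the work in [13] for the case of a
block-fading channel to the stationary Rayleigh flat-fading scenario discussed in the present work. Therefore,
analogously to [13] we decompose and lower-bound the mutual information between the transmitter and
the receiver $\mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P)$ as follows

$$\begin{aligned}
\mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) &\overset{(a)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P, \mathbf{h}) - \mathcal{I}(\mathbf{x}_D; \mathbf{h} | \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) \\
&\overset{(b)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - h(\mathbf{h} | \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) + h(\mathbf{h} | \mathbf{x}_D, \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) \\
&\overset{(c)}{\ge} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - h(\mathbf{h} | \mathbf{y}_P, \mathbf{x}_P) + h(\mathbf{h} | \mathbf{x}_D, \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P)
\end{aligned} \tag{28}$$

where (a) follows from the chain rule for mutual information. For the first term in (b) we have used
the fact that due to the knowledge on $\mathbf{h}$, the knowledge on $\mathbf{y}_P$ and $\mathbf{x}_P$ does not increase the mutual
information between $\mathbf{x}_D$ and $\mathbf{y}_D$. Finally, (c) is due to the fact that conditioning reduces entropy. Note that
the first term on the RHS of (28) is the mutual information in case of perfect channel knowledge.

In the following we deviate from the derivation given in [13]. Now, we calculate both differential
entropy terms at the RHS of (28). Therefore, we rewrite the RHS of (28) as follows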

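$$\begin{aligned}
\mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) &\ge \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - h(\mathbf{h} | \mathbf{y}_P, \mathbf{x}_P) + h(\mathbf{h} | \mathbf{x}_D, \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) \\
&\overset{(a)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - h(\mathbf{h} | \hat{\mathbf{h}}_{\mathrm{pil}}, \mathbf{x}_P) + h(\mathbf{h} | \hat{\mathbf{h}}_{\mathrm{joint}}, \mathbf{x}_D, \mathbf{x}_P) \\
&\overset{(b)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - h(\hat{\mathbf{h}}_{\mathrm{pil}} + \mathbf{e}_{\mathrm{pil}} | \hat{\mathbf{h}}_{\mathrm{pil}}, \mathbf{x}_P) + h(\hat{\mathbf{h}}_{\mathrm{joint}} + \mathbf{e}_{\mathrm{joint}} | \hat{\mathbf{h}}_{\mathrm{joint}}, \mathbf{x}_D, \mathbf{x}_P) \\
&\overset{(c)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - h(\mathbf{e}_{\mathrm{pil}} | \mathbf{x}_P) + h(\mathbf{e}_{\mathrm{joint}} | \mathbf{x}_D, \mathbf{x}_P) \\
&\overset{(d)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - \mathrm{E}_{\mathbf{x}_P}\left[\log\det\left(\pi e \mathbf{R}_{e_{\mathrm{pil}}}\right)\right] + \mathrm{E}_{\mathbf{x}_D, \mathbf{x}_P}\left[\log\det\left(\pi e \mathbf{R}_{e_{\mathrm{joint}}}\right)\right] \\
&\overset{(e)}{=} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - \log\det\left(\mathbf{R}_{e_{\mathrm{pil}}}\right) + \mathrm{E}_{\mathbf{x}_D}\left[\log\det\left(\mathbf{R}_{e_{\mathrm{joint}}}\right)\right]
\end{aligned} \tag{29}$$

where for the second term in (a) we have substituted the condition on $\mathbf{y}_P$ by $\hat{\mathbf{h}}_{\mathrm{pil}}$, which is possible as
the estimate $\hat{\mathbf{h}}_{\mathrm{pil}}$ contains the same information on $\mathbf{h}$ as $\mathbf{y}_P$ while conditioning on $\mathbf{x}_P$. Corresponding to
the solely pilot based channel estimate $\hat{\mathbf{h}}_{\mathrm{pil}}$, based on $\mathbf{x}_D$, $\mathbf{x}_P$, $\mathbf{y}_D$, and $\mathbf{y}_P$, we can calculate the estimate
$\hat{\mathbf{h}}_{\mathrm{joint}}$, which is based on data and pilot symbols. Like $\hat{\mathbf{h}}_{\mathrm{pil}}$ this estimate is a MAP estimate, which, due to
the jointly Gaussian nature of the problem, is an MMSE estimate, i.e.,

$$\hat{\mathbf{h}}_{\mathrm{joint}} = \mathrm{E}[\mathbf{h} | \mathbf{y}_D, \mathbf{x}_D, \mathbf{y}_P, \mathbf{x}_P]. \tag{30}$$

Thus, for (a) we have substituted the conditioning on $\mathbf{y}_D$ and $\mathbf{y}_P$ by conditioning on $\hat{\mathbf{h}}_{\mathrm{joint}}$ in the third
term, as $\hat{\mathbf{h}}_{\mathrm{joint}}$ contains all information on $\mathbf{h}$ that is contained in $\mathbf{y}_D$ and $\mathbf{y}_P$ while $\mathbf{x}_D$ and $\mathbf{x}_P$ are known.
For equality (b) we have used for the second term that $\mathbf{h}$ can be expressed as a sum of its estimate $\hat{\mathbf{h}}_{\mathrm{pil}}$
and the estimation error $\mathbf{e}_{\mathrm{pil}}$, cf. (18). Analogously, for the third term we used the separation of $\mathbf{h}$ into
the estimate $\hat{\mathbf{h}}_{\mathrm{joint}}$ and the corresponding estimation error $\mathbf{e}_{\mathrm{joint}}$, i.e.,

$$\mathbf{e}_{\mathrm{joint}} = \mathbf{h} - \hat{\mathbf{h}}_{\mathrm{joint}}. \tag{31}$$

Equality (c) is due to the fact that the addition of a constant does not change differential entropy and
that the estimation error $\mathbf{e}_{\mathrm{pil}}$ is independent of the estimate $\hat{\mathbf{h}}_{\mathrm{pil}}$ and analogously $\mathbf{e}_{\mathrm{joint}}$, which depends on
$\mathbf{x}_D$ and $\mathbf{x}_P$, is independent of $\hat{\mathbf{h}}_{\mathrm{joint}}$ due to the orthogonality principle in LMMSE estimation. Finally, (d)
follows from the fact that the estimation error processes are zero-mean jointly proper Gaussian. Here the
error correlation matrices are given by (19) and by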



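$$\mathbf{R}_{e_{\mathrm{joint}}} = \mathrm{E}\left[\mathbf{e}_{\mathrm{joint}} \mathbf{e}_{\mathrm{joint}}^H \,\middle|\, \mathbf{x}_D, \mathbf{x}_P\right]. \tag{32}$$

For (e) we have used that the pilot symbols are deterministic. Therefore, the expectation over $\mathbf{x}_P$ in
the second and third term can be removed. However, the channel estimation error $\mathbf{e}_{\mathrm{joint}}$ depends on the
distribution of the data symbols $\mathbf{x}_D$. Concerning the third term on the RHS of (29), it can be shown that
the differential entropy rate $h'(\mathbf{e}_{\mathrm{joint}} | \mathbf{x}_D, \mathbf{x}_P)$, i.e.,

$$h'(\mathbf{e}_{\mathrm{joint}} | \mathbf{x}_D, \mathbf{x}_P) = \lim_{N\to\infty} \frac{1}{N} h(\mathbf{e}_{\mathrm{joint}} | \mathbf{x}_D, \mathbf{x}_P) \tag{33}$$

is minimized for a given average transmit power if the data symbols are constant modulus (CM)
symbols with power $\sigma_x^2$, see Appendix A. Within this proof the restriction to an absolutely summable
autocorrelation function $r_h(l)$, see (3), is required.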

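Thus, based on (29) a lower bound for the achievable rate with joint processing of data and pilot
symbols is given by

$$\begin{aligned}
\mathcal{I}'(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) &= \lim_{N\to\infty} \frac{1}{N} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{y}_P, \mathbf{x}_P) \\
&\ge \lim_{N\to\infty} \frac{1}{N} \left\{ \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - \log\det\left(\mathbf{R}_{e_{\mathrm{pil}}}\right) + \log\det\left(\mathbf{R}_{e_{\mathrm{joint}},\mathrm{CM}}\right) \right\} \\
&\overset{(a)}{=} \lim_{N\to\infty} \frac{1}{N} \mathcal{I}(\mathbf{x}_D; \mathbf{y}_D, \mathbf{h}) - \int_{-\frac{1}{2}}^{\frac{1}{2}} \log\left(\frac{S_{e_{\mathrm{pil}}}(f)}{S_{e_{\mathrm{joint}},\mathrm{CM}}(f)}\right) df
\end{aligned} \tag{34}$$

with $\mathbf{R}_{e_{\mathrm{joint}},\mathrm{CM}}$ corresponding to (32), but under the assumption of CM data symbols with transmit power
$\sigma_x^2$. As $\mathbf{R}_{e_{\mathrm{joint}},\mathrm{CM}}$ only depends on the distribution of the magnitude of the data symbols contained in $\mathbf{x}_D$,
which is constant and deterministic, we can remove the expectation operation with respect to $\mathbf{x}_D$. Note
that the CM assumption has only been used to lower-bound the third term at the RHS of (29), and not the
whole expression at the RHS of (29). For (a) in (34) we have used Szegő's theorem on the asymptotic
eigenvalue distribution of Hermitian Toeplitz matrices [15]. $S_{e_{\mathrm{pil}}}(f)$ and $S_{e_{\mathrm{joint}},\mathrm{CM}}(f)$ are the PSDs of the
channel estimation error processes, on the one hand, if the estimation is solely based on pilot symbols,
and on the other hand, if the estimation is based on data and pilot symbols, assuming CM data symbols.
They are given by

$$S_{e_{\mathrm{pil}}}(f) = \frac{S_h(f)}{\frac{\rho}{L}\frac{S_h(f)}{\sigma_h^2} + 1} \tag{35}$$

$$S_{e_{\mathrm{joint}},\mathrm{CM}}(f) = \frac{S_h(f)}{\rho\,\frac{S_h(f)}{\sigma_h^2} + 1}. \tag{36}$$

The derivation of these PSDs is given in Appendix B.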

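However, the application of Szegő's theorem for (a) in (34) requires several steps, which we discuss in
the following. The limit over the second and the third term on the LHS of (a) in (34) can be transformed
as follows

$$\begin{aligned}
\lim_{N\to\infty} \frac{1}{N} \left\{ \log\det\left(\mathbf{R}_{e_{\mathrm{pil}}}\right) - \log\det\left(\mathbf{R}_{e_{\mathrm{joint}},\mathrm{CM}}\right) \right\}
&\overset{(a)}{=} \lim_{N\to\infty} \frac{1}{N} \left\{ \log\det\left(\mathbf{C}_{e_{\mathrm{pil}}}\right) - \log\det\left(\mathbf{C}_{e_{\mathrm{joint}},\mathrm{CM}}\right) \right\} \\
&\overset{(b)}{=} \lim_{N\to\infty} \frac{1}{N} \left\{ \log\det\left(\mathbf{F}\boldsymbol{\Lambda}_{e_{\mathrm{pil}}}\mathbf{F}^H\right) - \log\det\left(\mathbf{F}\boldsymbol{\Lambda}_{e_{\mathrm{joint}},\mathrm{CM}}\mathbf{F}^H\right) \right\} \\
&= \lim_{N\to\infty} \frac{1}{N} \log\det\left(\mathbf{F}\boldsymbol{\Lambda}_{e_{\mathrm{pil}}}\boldsymbol{\Lambda}_{e_{\mathrm{joint}},\mathrm{CM}}^{-1}\mathbf{F}^H\right) \\
&\overset{(c)}{=} \int_{-\frac{1}{2}}^{\frac{1}{2}} \log\left(\frac{S_{e_{\mathrm{pil}}}(f)}{S_{e_{\mathrm{joint}},\mathrm{CM}}(f)}\right) df \\
&\overset{(d)}{=} \int_{-\frac{1}{2}}^{\frac{1}{2}} \log\left(\frac{\rho\,\frac{S_h(f)}{\sigma_h^2} + 1}{\frac{\rho}{L}\frac{S_h(f)}{\sigma_h^2} + 1}\right) df
\end{aligned} \tag{37}$$

where for (a) we have substituted the Toeplitz matrices $\mathbf{R}_{e_{\mathrm{pil}}}$ and $\mathbf{R}_{e_{\mathrm{joint}},\mathrm{CM}}$ by their asymptotic equivalent
circulant matrices $\mathbf{C}_{e_{\mathrm{pil}}}$ and $\mathbf{C}_{e_{\mathrm{joint}},\mathrm{CM}}$, see [16]. Furthermore, for (b) we have used the spectral decompositions
of the circulant matrices given by

$$\mathbf{C}_{e_{\mathrm{pil}}} = \mathbf{F}\boldsymbol{\Lambda}_{e_{\mathrm{pil}}}\mathbf{F}^H \tag{38}$$

$$\mathbf{C}_{e_{\mathrm{joint}},\mathrm{CM}} = \mathbf{F}\boldsymbol{\Lambda}_{e_{\mathrm{joint}},\mathrm{CM}}\mathbf{F}^H \tag{39}$$

where $\boldsymbol{\Lambda}_{e_{\mathrm{pil}}}$ and $\boldsymbol{\Lambda}_{e_{\mathrm{joint}},\mathrm{CM}}$ are diagonal matrices containing the eigenvalues of $\mathbf{C}_{e_{\mathrm{pil}}}$ and $\mathbf{C}_{e_{\mathrm{joint}},\mathrm{CM}}$, and the
matrix $\mathbf{F}$ is a unitary DFT-matrix whose elements are given by

$$[\mathbf{F}]_{k,l} = \frac{1}{\sqrt{N}}\, e^{j 2\pi \frac{(k-1)(l-1)}{N}}. \tag{40}$$

For (c) in (37) we have then used Szegő's theorem on the asymptotic eigenvalue distribution of Hermitian
Toeplitz matrices [15]. Therefore, first consider that the matrix $\mathbf{F}\boldsymbol{\Lambda}_{e_{\mathrm{pil}}}\boldsymbol{\Lambda}_{e_{\mathrm{joint}},\mathrm{CM}}^{-1}\mathbf{F}^H$ on the LHS of (c) is
again a circulant matrix and that there exists an asymptotically equivalent Toeplitz matrix. Furthermore,
the eigenvalues of $\mathbf{C}_{e_{\mathrm{pil}}}$ are samples of the PSD $S_{e_{\mathrm{pil}}}(f)$ and the eigenvalues of $\mathbf{C}_{e_{\mathrm{joint}},\mathrm{CM}}$ are samples of
the PSD $S_{e_{\mathrm{joint}},\mathrm{CM}}(f)$. Here we assume a construction of the circulant matrices as described in [16, (4.32)],
see also Appendix A from (72) to (76). Furthermore, the application of Szegő's theorem requires that
the log-function is continuous on the support of the eigenvalues of the matrix $\boldsymbol{\Lambda}_{e_{\mathrm{pil}}}\boldsymbol{\Lambda}_{e_{\mathrm{joint}},\mathrm{CM}}^{-1}$. This means
that we have to show that the eigenvalues of $\boldsymbol{\Lambda}_{e_{\mathrm{pil}}}\boldsymbol{\Lambda}_{e_{\mathrm{joint}},\mathrm{CM}}^{-1}$ are bounded away from zero and from infinity.
That this is indeed the case will become obvious after introducing $S_{e_{\mathrm{pil}}}(f)$ and $S_{e_{\mathrm{joint}},\mathrm{CM}}(f)$ given in (35)
and (36) as it has been done in (d). Obviously, the argument of the log at the RHS of (37) is larger than
zero and smaller than infinity on the interval $f \in [-0.5, 0.5]$. Therefore, the integral on the RHS of (37)
exists, implying that also the LHS of (c) in (37) is bounded and, thus, that the eigenvalues of $\boldsymbol{\Lambda}_{e_{\mathrm{pil}}}\boldsymbol{\Lambda}_{e_{\mathrm{joint}},\mathrm{CM}}^{-1}$
are bounded away from zero and from infinity. Thus, in conclusion we have shown that Szegő's theorem
is applicable and that (a) in (34) holds.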
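The statement of Szegő's theorem used here, namely that $\frac{1}{N}\log\det$ of a Hermitian Toeplitz matrix approaches $\int \log S(f)\, df$, can be observed numerically. The sketch below is only an illustration and deliberately uses a different spectrum than the paper: the autocorrelation $r(l) = a^{|l|}$ with PSD $(1-a^2)/(1 - 2a\cos(2\pi f) + a^2)$, chosen because the limit $\log(1-a^2)$ is known in closed form. All names are hypothetical.

```python
import math


def log_det_spd(m):
    """log det of a symmetric positive definite matrix via Cholesky factorization."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                log_det += 2.0 * math.log(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return log_det


def toeplitz(first_column):
    """Symmetric Toeplitz matrix built from its first column."""
    n = len(first_column)
    return [[first_column[abs(i - j)] for j in range(n)] for i in range(n)]


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def szego_sides(a, n):
    """Return ((1/n) log det R_n, integral of log S(f)) for r(l) = a^|l|."""
    lhs = log_det_spd(toeplitz([a ** lag for lag in range(n)])) / n
    psd = lambda f: (1.0 - a * a) / (1.0 - 2.0 * a * math.cos(2.0 * math.pi * f) + a * a)
    rhs = simpson(lambda f: math.log(psd(f)), -0.5, 0.5)
    return lhs, rhs


if __name__ == "__main__":
    for size in (10, 20, 60):
        lhs, rhs = szego_sides(0.5, size)
        print(f"N = {size:3d}: (1/N) log det = {lhs:.5f}, integral = {rhs:.5f}")
```

The gap between both sides shrinks as the matrix size grows, which is the behaviour the derivation relies on in the limit $N \to \infty$.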

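The first term on the RHS of (34) is the mutual information rate in case of perfect channel state
information, which for an average power constraint is maximized with i.i.d. zero-mean proper Gaussian
data symbols. Thus, we get the following lower bound on the achievable rate with joint processing

$$\mathcal{R}_{L,\mathrm{joint}} = \frac{L-1}{L}\, C_{\mathrm{perf}}(\rho) - \int_{-\frac{1}{2}}^{\frac{1}{2}} \log\left(\frac{\frac{\rho}{\sigma_h^2} S_h(f) + 1}{\frac{\rho}{L \sigma_h^2} S_h(f) + 1}\right) df \tag{41}$$

where $C_{\mathrm{perf}}(\rho)$ corresponds to the coherent capacity with

$$C_{\mathrm{perf}}(\rho) = \mathrm{E}_{h_k}\left[\log\left(1 + \rho\, \frac{|h_k|^2}{\sigma_h^2}\right)\right] = \int_{z=0}^{\infty} \log(1 + \rho z)\, e^{-z}\, dz \tag{42}$$

and the factor $(L-1)/L$ arises as each $L$-th symbol is a pilot symbol.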
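The coherent capacity integral $\int_0^\infty \log(1 + \rho z)\, e^{-z}\, dz$ is easy to evaluate numerically. The following sketch does so with a composite Simpson rule; the natural logarithm (rates in nats), the truncation point of the integral, and the function names are assumptions made for this illustration.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def coherent_capacity(rho, z_max=50.0):
    """C_perf(rho) in nats; the integral over [0, inf) is truncated at z_max."""
    return simpson(lambda z: math.log1p(rho * z) * math.exp(-z), 0.0, z_max)


if __name__ == "__main__":
    for snr_db in (0, 10, 20):
        rho = 10.0 ** (snr_db / 10.0)
        print(f"{snr_db:2d} dB: C_perf = {coherent_capacity(rho):.4f} nat, "
              f"log(1 + rho) = {math.log1p(rho):.4f} nat")
```

By Jensen's inequality the result always stays below $\log(1 + \rho)$, the capacity of a non-fading channel with the same SNR.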



A. Lower Bound on the Achievable Rate for a Joint Processing of Data and Pilot Symbols and a Fixed 
Pilot Spacing 

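Substituting (42) into (41) we have found a lower bound on the achievable rate with joint processing
of data and pilot symbols, for a given pilot spacing $L$ and stationary Rayleigh flat-fading.
For the special case of a rectangular PSD² of the channel fading process, i.e.,

$$S_h(f)\big|_{\mathrm{rect}} = \begin{cases} \dfrac{\sigma_h^2}{2 f_d} & \text{for } |f| \le f_d \\ 0 & \text{otherwise} \end{cases} \tag{43}$$

the lower bound in (41) becomes

$$\mathcal{R}_{L,\mathrm{joint}}\big|_{\mathrm{rect.}\,S_h(f)} = \frac{L-1}{L} \int_{z=0}^{\infty} \log(1 + \rho z)\, e^{-z}\, dz - 2 f_d \log\left(\frac{\frac{\rho}{2 f_d} + 1}{\frac{\rho}{2 f_d L} + 1}\right). \tag{44}$$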
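For the rectangular PSD the bound can be evaluated directly. The sketch below assumes the closed form $\frac{L-1}{L} C_{\mathrm{perf}}(\rho) - 2 f_d \log\big((\frac{\rho}{2 f_d} + 1) / (\frac{\rho}{2 f_d L} + 1)\big)$ with natural logarithms; function names and parameter values are illustrative.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def coherent_capacity(rho, z_max=50.0):
    """C_perf(rho) = int_0^inf log(1 + rho z) exp(-z) dz in nats, truncated at z_max."""
    return simpson(lambda z: math.log1p(rho * z) * math.exp(-z), 0.0, z_max)


def joint_lower_bound_rect(rho, f_d, spacing, c_perf=None):
    """Assumed lower bound for joint processing with a rectangular PSD (nats per symbol)."""
    if c_perf is None:
        c_perf = coherent_capacity(rho)
    a = rho / (2.0 * f_d)
    penalty = 2.0 * f_d * math.log((a + 1.0) / (a / spacing + 1.0))
    return (spacing - 1.0) / spacing * c_perf - penalty


if __name__ == "__main__":
    rho, f_d = 10.0, 0.03
    c_perf = coherent_capacity(rho)  # computed once and reused for all spacings
    for spacing in (2, 4, 8, 16):
        rate = joint_lower_bound_rect(rho, f_d, spacing, c_perf)
        print(f"L = {spacing:2d}: lower bound = {rate:.4f} nat")
```

For these example values the bound grows with the pilot spacing over the whole range admitted by the Nyquist condition.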



B. Lower Bound on the Achievable Rate for a Joint Processing of Data and Pilot Symbols and an Optimal 
Pilot Spacing 

Obviously, the lower bound in (44) still depends on the pilot spacing $L$. In case the pilot spacing is not
fixed, we can further enhance it by calculating the supremum of (44) with respect to $L$. In this regard,
it has to be considered that the pilot spacing $L$ is an integer value. Furthermore, we have to take into
account that the derivation of the lower bound in (44) is based on the assumption that the pilot spacing
is chosen such that the channel fading process is at least sampled with Nyquist rate, i.e., (7) has to be
fulfilled. In case the pilot spacing $L$ is chosen larger than the Nyquist condition (7) allows, the estimation error process
is no longer stationary, which is required for our derivation. At this point it is also important to remark
that periodically inserted pilot symbols do not maximize the achievable rate. For the special case of PSK
signaling, it is shown in [11] that the use of a single pilot symbol, i.e., not periodically inserted pilot
symbols, is optimal in the sense that it maximizes the achievable rate. However, in the present work we
restrict to the assumption of periodically inserted pilot symbols with a pilot spacing fulfilling (7), which
is customary and reasonable as this enables detection and decoding with manageable complexity.

For these conditions, i.e., positive integer values for $L$ fulfilling (7), it can be shown that the lower
bound $\mathcal{R}_{L,\mathrm{joint}}\big|_{\mathrm{rect.}\,S_h(f)}$ in (44) is maximized for

$$L_{\mathrm{opt}} = \left\lfloor \frac{1}{2 f_d} \right\rfloor. \tag{45}$$

²Note that a rectangular PSD $S_h(f)$ corresponds to $r_h(l) = \sigma_h^2\, \mathrm{sinc}(2 f_d l)$, which is not absolutely summable. However, the rectangular
PSD can be arbitrarily closely approximated by a PSD with a raised cosine shape, whose corresponding correlation function is absolutely
summable.
To prove this statement we differentiate the RHS of (44) with respect to $L$ and set the result equal to
zero, which yields that the RHS of (44) has a unique local extremum at

$$L = \frac{1}{2 f_d} \cdot \frac{C_{\mathrm{perf}}(\rho)\, \rho}{\rho - C_{\mathrm{perf}}(\rho)}. \tag{46}$$

Numerical evaluation shows that the factor $\frac{C_{\mathrm{perf}}(\rho)\, \rho}{\rho - C_{\mathrm{perf}}(\rho)}$ is larger than one, so that the extremum in (46) lies above $1/(2 f_d)$. As (46) is the only local extremum
of the RHS of (44), and with the constraints on $L$ given by (7) and the fact that $L$ is an integer value,
and considering that $\mathcal{R}_{L,\mathrm{joint}}\big|_{\mathrm{rect.}\,S_h(f)}$ monotonically increases with $L$ below the extremum in (46), we can conclude that
the lower bound is maximized by $L_{\mathrm{opt}}$ in (45).

Substituting $L$ in (44) by $L_{\mathrm{opt}}$ in (45) yields a lower bound on the achievable rate with joint processing
in case the pilot spacing can be arbitrarily chosen while fulfilling (7).
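The argument can be cross-checked numerically. The sketch below assumes the rectangular-PSD bound $\frac{L-1}{L} C_{\mathrm{perf}}(\rho) - 2 f_d \log\big((\frac{\rho}{2 f_d} + 1) / (\frac{\rho}{2 f_d L} + 1)\big)$, the Nyquist condition $L \le 1/(2 f_d)$, and natural logarithms. It evaluates the factor $C_{\mathrm{perf}}(\rho)\rho / (\rho - C_{\mathrm{perf}}(\rho))$ and compares a brute-force search over the admissible integer spacings with $\lfloor 1/(2 f_d) \rfloor$. All names are hypothetical.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def coherent_capacity(rho, z_max=50.0):
    """C_perf(rho) = int_0^inf log(1 + rho z) exp(-z) dz in nats, truncated at z_max."""
    return simpson(lambda z: math.log1p(rho * z) * math.exp(-z), 0.0, z_max)


def joint_lower_bound_rect(rho, f_d, spacing, c_perf):
    """Assumed rectangular-PSD lower bound for joint processing (nats per symbol)."""
    a = rho / (2.0 * f_d)
    return ((spacing - 1.0) / spacing * c_perf
            - 2.0 * f_d * math.log((a + 1.0) / (a / spacing + 1.0)))


def extremum_factor(rho):
    """Factor C_perf * rho / (rho - C_perf) that scales 1 / (2 f_d) at the extremum."""
    c_perf = coherent_capacity(rho)
    return c_perf * rho / (rho - c_perf)


def constrained_optimum(f_d):
    """Largest integer spacing allowed by the assumed condition L <= 1 / (2 f_d)."""
    return math.floor(1.0 / (2.0 * f_d))


def brute_force_optimum(rho, f_d):
    """Admissible integer spacing that maximizes the bound, found by exhaustive search."""
    c_perf = coherent_capacity(rho)
    candidates = range(1, constrained_optimum(f_d) + 1)
    return max(candidates, key=lambda s: joint_lower_bound_rect(rho, f_d, s, c_perf))


if __name__ == "__main__":
    rho, f_d = 10.0, 0.03
    print("extremum factor     :", round(extremum_factor(rho), 3))
    print("floor(1 / (2 f_d))  :", constrained_optimum(f_d))
    print("brute-force optimum :", brute_force_optimum(rho, f_d))
```

Because the unconstrained extremum lies beyond $1/(2 f_d)$, the search ends at the largest admissible spacing.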

V. Numerical Evaluation 

Fig. 1 shows a comparison of the bounds on the achievable rate for separate and joint processing of
data and pilot symbols.

On the one hand, the lower bound on the achievable rate for joint processing in (44) is compared to
bounds on the achievable rate with separate processing of data and pilot symbols, i.e., (26) and (27), for
a fixed pilot spacing. As the upper and lower bound on the achievable rate with separate processing are
relatively tight, we choose the pilot spacing such that the lower bound on the achievable rate for separate
processing in (26) is maximized. It can be seen that except for very high channel dynamics, i.e., very large
$f_d$, the lower bound on the achievable rate for joint processing is larger than the bounds on the achievable
rate with separate processing. This indicates the possible gain while using joint processing of data and
pilot symbols for a given pilot spacing. Note that the observation that the lower bound for joint processing
for large $f_d$ is smaller than the bounds on the achievable rate with separate processing is a result of the
lower bounding, i.e., it indicates that the lower bound is not tight for these parameters.

On the other hand, also the lower bound on the achievable rate with joint processing and a pilot
spacing that maximizes this lower bound, i.e., (44) in combination with (45), is shown. In this case the
pilot spacing is always chosen such that the channel fading process is sampled by the pilot symbols with
Nyquist rate. Obviously, this lower bound is larger than or equal to the lower bound for joint processing
while choosing the pilot spacing as it is optimal for separate processing of data and pilot symbols. This
behavior arises from the effect that for separate processing in case of small $f_d$ a pilot rate is chosen that
is higher than the Nyquist rate of the channel fading process to enhance the channel estimation quality.
In case of a joint processing all symbols are used for channel estimation anyway. Therefore, a pilot rate
higher than Nyquist rate always leads to an increased loss in the achievable rate as fewer symbols can be
used for data transmission.

Fig. 2 shows the lower bound on the achievable rate for joint processing of data and pilot symbols when
choosing $L$ as given in (45), which maximizes the lower bound in (44). This lower bound is compared
to the following bounds on the achievable rate with i.i.d. zero-mean proper Gaussian (PG) input symbols
for a rectangular PSD of the channel fading process, see (43), which have been given in [14]

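$$\mathcal{I}'_L(\mathbf{y}; \mathbf{x})\big|_{\mathrm{PG}} = \max\left\{ C_{\mathrm{perf}}(\rho) - 2 f_d \log\left(1 + \frac{\rho}{2 f_d}\right),\, 0 \right\} \tag{47}$$

$$\mathcal{I}'_U(\mathbf{y}; \mathbf{x})\big|_{\mathrm{PG}} = \min\left\{ \log(1 + \rho) - 2 f_d \int_{z=0}^{\infty} \log\left(1 + \frac{\rho}{2 f_d}\, z\right) e^{-z}\, dz,\, C_{\mathrm{perf}}(\rho) \right\} \tag{48}$$

with $C_{\mathrm{perf}}(\rho)$ being the coherent capacity of a Rayleigh flat-fading channel given in (42).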

Obviously, for some parameters the lower bound on the achievable rate for joint processing of data and
pilot symbols is larger than the lower bound on the achievable rate with i.i.d. zero-mean proper Gaussian
input symbols, i.e., without the assumption of any pilot symbols. However, this observation does not allow
us to argue that in these cases the use of pilot symbols is better than i.i.d. symbols, as we only compare
lower bounds.

Fig. 1. Comparison of bounds on the achievable rate with separate processing of data and pilot symbols to lower bounds on the achievable
rate with joint processing of data and pilot symbols; except for LB joint proc. $L_{\mathrm{opt}}$, the pilot spacing $L$ is chosen such that the lower bound
for separate processing (26) is maximized; the PSD $S_h(f)$ is assumed to be rectangular, see (43).

VI. Summary 

In the present work, we have studied the achievable rate with a joint processing of pilot and data 
symbols in the context of stationary Rayleigh flat-fading channels. We have discussed the nature of the 
possible gain when using joint processing of data and pilot symbols in contrast to separate processing. 
We have shown that the additional information that can be retrieved by joint processing is contained in 
the temporal correlation of the channel estimation error process when using a solely pilot based channel 
estimation, which cannot be captured by standard decoders as they are used in conventional synchronized 
detection based receivers with a solely pilot based channel estimation. In addition, and this is the main 
novelty of the present work, we have derived a lower bound on the achievable rate for joint processing of 
data and pilot symbols on a stationary Rayleigh flat-fading channel, giving an indication of the possible 
gain in terms of the achievable rate when using a joint processing of pilot and data symbols in comparison 
to the typically used separate processing. 

Appendix A 

Minimization of $h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$ by CM Modulation 

In this appendix we show that the differential entropy rate $h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$ in (33), which depends on the distribution of the data symbols contained in $\mathbf{x}_D$, is minimized for constant modulus input symbols among all distributions of the data symbols with a maximum average power of $\sigma_x^2$. 

Fig. 2. Lower bound on the achievable rate with joint processing of data and pilot symbols and a pilot spacing $L_{\mathrm{opt}}$ that maximizes this bound, i.e., (44) in combination with (45); for comparison, bounds on the achievable rate with i.i.d. zero-mean proper Gaussian (PG) input symbols are shown; rectangular PSD $S_h(f)$, see (43). 



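The MAP channel estimate based on pilot and perfectly known data symbols is given by 

$$
\begin{aligned}
\hat{\mathbf{h}}_{\mathrm{joint}} &= \arg\max_{\mathbf{h}}\; p(\mathbf{h}|\mathbf{y},\mathbf{x}) \\
&= \arg\max_{\mathbf{h}}\; p(\mathbf{y}|\mathbf{h},\mathbf{x})\, p(\mathbf{h}) \\
&= \arg\max_{\mathbf{h}} \left\{ \log p(\mathbf{y}|\mathbf{h},\mathbf{x}) + \log p(\mathbf{h}) \right\}
\end{aligned} \tag{49}
$$

with 

$$ p(\mathbf{y}|\mathbf{h},\mathbf{x}) = \frac{1}{\pi^N \sigma_n^{2N}} \exp\left( -\frac{\|\mathbf{y} - \mathbf{X}\mathbf{h}\|^2}{\sigma_n^2} \right) \tag{50} $$

$$ p(\mathbf{h}) = \frac{1}{\pi^N \det(\mathbf{R}_h)} \exp\left( -\mathbf{h}^H \mathbf{R}_h^{-1} \mathbf{h} \right). \tag{51} $$

Thus, (49) becomes 

$$ \hat{\mathbf{h}}_{\mathrm{joint}} = \arg\max_{\mathbf{h}} \left\{ -\frac{1}{\sigma_n^2} \|\mathbf{y} - \mathbf{X}\mathbf{h}\|^2 - \mathbf{h}^H \mathbf{R}_h^{-1} \mathbf{h} \right\}. \tag{52} $$

Differentiating the argument of the maximum operation on the RHS of (52) with respect to $\mathbf{h}$ and setting the result equal to zero yields 

$$ -\frac{1}{\sigma_n^2} \left\{ -\mathbf{X}^H \mathbf{y} + \mathbf{X}^H \mathbf{X} \mathbf{h} \right\} - \mathbf{R}_h^{-1} \mathbf{h} = \mathbf{0} \tag{53} $$

and, thus,¹ 

$$ \hat{\mathbf{h}}_{\mathrm{joint}} = \mathbf{R}_h \left( \mathbf{R}_h + \sigma_n^2 (\mathbf{X}^H \mathbf{X})^{-1} \right)^{-1} \mathbf{X}^{-1} \mathbf{y}. \tag{55} $$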
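As a numerical sanity check of the step from (53) to (55), the following sketch solves the stationarity condition (53) directly for $\mathbf{h}$ and compares the result with the closed form (55). The exponentially decaying correlation matrix, the noise variance, and all function names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def estimate_via_55(R_h, x, y, sigma_n2):
    """Closed form (55): R_h (R_h + sigma_n^2 (X^H X)^{-1})^{-1} X^{-1} y."""
    D_inv = np.diag(1.0 / np.abs(x) ** 2)      # (X^H X)^{-1}; requires x_k != 0
    # X is diagonal, so X^{-1} y is an element-wise division.
    return R_h @ np.linalg.solve(R_h + sigma_n2 * D_inv, y / x)

def estimate_via_53(R_h, x, y, sigma_n2):
    """Solve (53) for h: (X^H X / sigma_n^2 + R_h^{-1}) h = X^H y / sigma_n^2."""
    X = np.diag(x)
    A = X.conj().T @ X / sigma_n2 + np.linalg.inv(R_h)
    return np.linalg.solve(A, X.conj().T @ y / sigma_n2)

def example_setup(N=8, a=0.9, sigma_n2=0.5, seed=1):
    """Hypothetical test case: r_h(k) = a^|k|, random symbols and observations."""
    rng = np.random.default_rng(seed)
    idx = np.arange(N)
    R_h = a ** np.abs(idx[:, None] - idx[None, :])
    x = rng.normal(size=N) + 1j * rng.normal(size=N)
    y = rng.normal(size=N) + 1j * rng.normal(size=N)
    return R_h, x, y, sigma_n2

if __name__ == "__main__":
    R_h, x, y, s2 = example_setup()
    diff = estimate_via_55(R_h, x, y, s2) - estimate_via_53(R_h, x, y, s2)
    print("max deviation between (53) and (55):", np.max(np.abs(diff)))
```

Both routes agree up to rounding as long as no transmit symbol is zero, which is the case the footnote to (55) addresses.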



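With (55) the channel estimation error correlation matrix $\mathbf{R}_{e_{\mathrm{joint}}}$ is given by 

$$ \mathbf{R}_{e_{\mathrm{joint}}} = \mathrm{E}\left[ \left(\mathbf{h} - \hat{\mathbf{h}}_{\mathrm{joint}}\right) \left(\mathbf{h} - \hat{\mathbf{h}}_{\mathrm{joint}}\right)^H \right] = \mathbf{R}_h - \mathbf{R}_h \left( \mathbf{R}_h + \sigma_n^2 (\mathbf{X}^H \mathbf{X})^{-1} \right)^{-1} \mathbf{R}_h. \tag{56} $$

Thus, the differential entropy $h(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$ becomes 

$$
\begin{aligned}
h(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P) &= \mathrm{E}_{\mathbf{x}}\left[ \log\det\left( \pi e\, \mathbf{R}_{e_{\mathrm{joint}}} \right) \right] \\
&= \log\left( (\pi e)^N \det(\mathbf{R}_h) \right) + \mathrm{E}_{\mathbf{x}}\left[ \log\det\left( \mathbf{I}_N - \left( \mathbf{R}_h + \sigma_n^2 (\mathbf{X}^H \mathbf{X})^{-1} \right)^{-1} \mathbf{R}_h \right) \right].
\end{aligned} \tag{57}
$$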



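The argument of the expectation operation in the last summand on the RHS of (57) can be rewritten as 

$$
\begin{aligned}
\log\det\left( \mathbf{I}_N - \left( \mathbf{R}_h + \sigma_n^2 (\mathbf{X}^H \mathbf{X})^{-1} \right)^{-1} \mathbf{R}_h \right)
&= \log\det\left( \mathbf{I}_N - \left( \mathbf{I}_N + \sigma_n^2 \mathbf{R}_h^{-1} (\mathbf{X}^H \mathbf{X})^{-1} \right)^{-1} \right) \\
&\overset{(a)}{=} \log\det\left( \left( \frac{1}{\sigma_n^2} \mathbf{X}^H \mathbf{X} \mathbf{R}_h + \mathbf{I}_N \right)^{-1} \right) \\
&= -\log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathbf{X}^H \mathbf{X} + \mathbf{I}_N \right)
\end{aligned} \tag{58}
$$

where (a) follows from the matrix inversion lemma and the last equality uses $\det(\mathbf{A}\mathbf{B} + \mathbf{I}) = \det(\mathbf{B}\mathbf{A} + \mathbf{I})$. Inserting (58) into (57) yields 

$$ h(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P) = \log\left( (\pi e)^N \det(\mathbf{R}_h) \right) - \mathrm{E}_{\mathbf{x}}\left[ \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathbf{X}^H \mathbf{X} + \mathbf{I}_N \right) \right]. \tag{59} $$

As the matrix $\mathbf{X} = \mathrm{diag}(\mathbf{x})$ is diagonal, the product $\mathbf{X}\mathbf{X}^H$ is also diagonal and its diagonal elements are the powers of the individual transmit symbols. In the following we substitute this product by 

$$ \mathbf{Z} = \mathbf{X}\mathbf{X}^H \tag{60} $$

and $\mathbf{z} = \mathrm{diag}(\mathbf{Z})$ contains the diagonal elements of $\mathbf{Z}$. 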
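The equivalence of (57) and (59) can be checked numerically for one fixed realization of the symbol powers, i.e., for the argument of the expectation. The sketch below is only an illustration: the correlation model $r_h(k) = a^{|k|}$, the parameter values, and the function names are assumptions made for this example.

```python
import numpy as np

def exp_correlation(N, a):
    """Hypothetical channel correlation matrix with r_h(k) = a^|k|."""
    idx = np.arange(N)
    return a ** np.abs(idx[:, None] - idx[None, :])

def error_covariance(R_h, z, sigma_n2):
    """Error correlation matrix (56) for symbol powers z (all z_k > 0)."""
    return R_h - R_h @ np.linalg.solve(R_h + sigma_n2 * np.diag(1.0 / z), R_h)

def entropy_via_57(R_h, z, sigma_n2):
    """log det(pi e R_e), evaluated directly from the error correlation matrix."""
    N = len(z)
    R_e = error_covariance(R_h, z, sigma_n2)
    return N * np.log(np.pi * np.e) + np.linalg.slogdet(R_e)[1]

def entropy_via_59(R_h, z, sigma_n2):
    """log((pi e)^N det R_h) - log det(R_h Z / sigma_n^2 + I_N), cf. (59)."""
    N = len(z)
    return (N * np.log(np.pi * np.e) + np.linalg.slogdet(R_h)[1]
            - np.linalg.slogdet(R_h @ np.diag(z) / sigma_n2 + np.eye(N))[1])

if __name__ == "__main__":
    R_h = exp_correlation(6, 0.8)
    z = np.array([0.3, 1.2, 0.7, 2.0, 1.0, 0.5])
    print(entropy_via_57(R_h, z, 0.4), entropy_via_59(R_h, z, 0.4))
```

The two printed values coincide up to rounding, which is what (58) states for a fixed $\mathbf{X}$.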

The aim of this appendix is to show that the entropy rate $h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$ corresponding to the entropy in (59) is minimized by constant modulus data symbols with the power $\sigma_x^2$ among all input distributions fulfilling the maximum average power constraint, i.e., 

$$ \mathrm{E}\left[ \mathbf{x}^H \mathbf{x} \right] = \mathrm{E}\left[ \sum_{k=1}^{N} z_k \right] \le N \sigma_x^2 \tag{61} $$

where the $z_k$ with $k = 1, \ldots, N$ are the elements of $\mathbf{z}$. To this end, in a first step, we study the entropy in (59), i.e., a finite transmission length $N$. Let the set $\mathcal{P}$ be the set containing all input distributions fulfilling the maximum average power constraint in (61). Note that this set $\mathcal{P}$ includes the case of having pilot symbols. However, when using pilot symbols, the transmit power of each $L$-th symbol is fixed to $\sigma_x^2$. For the moment, we allow all input distributions contained in $\mathcal{P}$. Later on, we will come back to the special case of using pilot symbols. 

¹ Note that the inverse of $\mathbf{X}$ in (55) does not exist if a diagonal element of the diagonal matrix $\mathbf{X}$ is zero, i.e., one transmit symbol has zero power. However, as the channel estimates can be rewritten as 

$$ \hat{\mathbf{h}}_{\mathrm{joint}} = \mathbf{R}_h \mathbf{X}^H \left( \mathbf{X} \mathbf{R}_h \mathbf{X}^H + \sigma_n^2 \mathbf{I}_N \right)^{-1} \mathbf{y} \tag{54} $$

it is obvious that the elements of $\hat{\mathbf{h}}_{\mathrm{joint}}$ are continuous in $x_k$ for all $k$, and, thus, this does not lead to problems in the following derivation. 

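We want to find the input vector $\mathbf{z}$ that minimizes (59) provided that the average power constraint is fulfilled. To this end, we first show that the argument of the expectation operation on the RHS of (59), i.e., 

$$ g(\mathbf{Z}) = \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathbf{Z} + \mathbf{I}_N \right) \tag{62} $$

is concave in $\mathbf{Z}$. To verify the concavity of $g(\mathbf{Z})$, we follow along the lines of [18, Chapter 3.1.5] and consider an arbitrary line $\mathbf{Z} = \tilde{\mathbf{Z}} + t\boldsymbol{\Delta}$. Based on this, we define $g(t)$ as 

$$
\begin{aligned}
g(t) &= \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \left( \tilde{\mathbf{Z}} + t\boldsymbol{\Delta} \right) + \mathbf{I}_N \right) \\
&= \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \right) + \log\det\left( \tilde{\mathbf{Z}} + \sigma_n^2 \mathbf{R}_h^{-1} + t\boldsymbol{\Delta} \right) \\
&\overset{(a)}{=} \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \right) + \log\det\left( \mathbf{Q} + t\boldsymbol{\Delta} \right) \\
&= \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \right) + \log\det\left( \mathbf{Q}^{\frac{1}{2}} \left( \mathbf{I}_N + t\, \mathbf{Q}^{-\frac{1}{2}} \boldsymbol{\Delta} \mathbf{Q}^{-\frac{1}{2}} \right) \mathbf{Q}^{\frac{1}{2}} \right) \\
&= \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \right) + \log\det\left( \mathbf{Q} \right) + \log\det\left( \mathbf{I}_N + t\, \mathbf{Q}^{-\frac{1}{2}} \boldsymbol{\Delta} \mathbf{Q}^{-\frac{1}{2}} \right) \\
&= \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \tilde{\mathbf{Z}} + \mathbf{I}_N \right) + \log\det\left( \mathbf{I}_N + t \left( \tilde{\mathbf{Z}} + \sigma_n^2 \mathbf{R}_h^{-1} \right)^{-\frac{1}{2}} \boldsymbol{\Delta} \left( \tilde{\mathbf{Z}} + \sigma_n^2 \mathbf{R}_h^{-1} \right)^{-\frac{1}{2}} \right) \\
&\overset{(b)}{=} \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \tilde{\mathbf{Z}} + \mathbf{I}_N \right) + \sum_{k=1}^{N} \log\left( 1 + t \lambda_k \right)
\end{aligned} \tag{63}
$$

where for (a) we have used the substitution $\mathbf{Q} = \tilde{\mathbf{Z}} + \sigma_n^2 \mathbf{R}_h^{-1}$ to simplify notation. Furthermore, the $\lambda_k$ in (b) are the eigenvalues of $\left( \tilde{\mathbf{Z}} + \sigma_n^2 \mathbf{R}_h^{-1} \right)^{-\frac{1}{2}} \boldsymbol{\Delta} \left( \tilde{\mathbf{Z}} + \sigma_n^2 \mathbf{R}_h^{-1} \right)^{-\frac{1}{2}}$. 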
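Based on (63) the derivatives of $g(t)$ with respect to $t$ are given by 

$$ \frac{dg(t)}{dt} = \sum_{k=1}^{N} \frac{\lambda_k}{1 + t\lambda_k} \tag{64} $$

$$ \frac{d^2 g(t)}{dt^2} = -\sum_{k=1}^{N} \frac{\lambda_k^2}{\left(1 + t\lambda_k\right)^2}. \tag{65} $$

As the second derivative $\frac{d^2 g(t)}{dt^2}$ is always negative, $g(\mathbf{Z})$ is concave on the set of diagonal matrices $\mathbf{Z}$ with non-negative diagonal entries. 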

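Based on the concavity of $g(\mathbf{Z})$ with respect to $\mathbf{Z}$ we can lower-bound $h(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$ in (59) by using Jensen's inequality as follows, cf. (62): 

$$
\begin{aligned}
h(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P) &= \log\left( (\pi e)^N \det(\mathbf{R}_h) \right) - \mathrm{E}_{\mathbf{z}}\left[ g(\mathbf{Z}) \right] \\
&\ge \log\left( (\pi e)^N \det(\mathbf{R}_h) \right) - \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right).
\end{aligned} \tag{66}
$$

Recall that we want to show that constant modulus data symbols with the power $\sigma_x^2$ minimize the entropy rate $h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$. Therefore, from here on we consider the entropy rate, which is given by 

$$
\begin{aligned}
h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P) &= \lim_{N\to\infty} \frac{1}{N} h(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P) \\
&\ge \lim_{N\to\infty} \frac{1}{N} \left\{ \log\left( (\pi e)^N \det(\mathbf{R}_h) \right) - \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right) \right\}.
\end{aligned} \tag{67}
$$

In the next step, we show for which kind of distribution of $\mathbf{z}$ fulfilling the maximum average power constraint in (61) the RHS of (67) is minimized, i.e., we have to find 

$$ \lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right) \tag{68} $$

where the set $\mathcal{P}$ contains all input distributions fulfilling the maximum average power constraint in (61). 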
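The concavity of $g(\mathbf{Z})$ and the Jensen step in (66) can be illustrated numerically. The following sketch draws random symbol powers and compares the sample average of $g$ with $g$ evaluated at the sample mean. The exponential power distribution (the power of a proper Gaussian symbol), the correlation model $r_h(k) = a^{|k|}$, and all names are assumptions for illustration only.

```python
import numpy as np

def exp_correlation(N, a):
    """Hypothetical channel correlation matrix with r_h(k) = a^|k|."""
    idx = np.arange(N)
    return a ** np.abs(idx[:, None] - idx[None, :])

def g(z, R_h, sigma_n2):
    """g(Z) = log det(R_h Z / sigma_n^2 + I_N) for Z = diag(z), cf. (62)."""
    N = len(z)
    return np.linalg.slogdet(R_h @ np.diag(z) / sigma_n2 + np.eye(N))[1]

def jensen_sides(R_h, sigma_n2, powers):
    """Return (average of g over the samples, g at the sample mean)."""
    mean_of_g = np.mean([g(z, R_h, sigma_n2) for z in powers])
    g_of_mean = g(powers.mean(axis=0), R_h, sigma_n2)
    return mean_of_g, g_of_mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R_h = exp_correlation(5, 0.7)
    powers = rng.exponential(size=(400, 5))   # one row per realization of z
    mean_of_g, g_of_mean = jensen_sides(R_h, 0.5, powers)
    print("E[g(Z)] estimate:", mean_of_g, " g(E[Z]):", g_of_mean)
```

Because $g$ is concave, the first printed value cannot exceed the second, which is the inequality that turns (59) into the lower bound (66).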

For the evaluation of (68) we substitute the Toeplitz matrix $\mathbf{R}_h$ by an asymptotically equivalent circulant matrix $\mathbf{C}_h$, which is possible, as we are finally interested in the supremum in (68) for the case of an infinite transmission length, i.e., $N \to \infty$. In the following, we will formalize the construction of $\mathbf{C}_h$ and show that the following holds: 

$$ \lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right) = \lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \log\det\left( \frac{1}{\sigma_n^2} \mathbf{C}_h \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right). \tag{69} $$

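To this end, we express the channel correlation matrix by its spectral decomposition 

$$ \mathbf{R}_h = \mathbf{R}_h^{(N)} = \mathbf{U}^{(N)} \boldsymbol{\Lambda}_h^{(N)} \left( \mathbf{U}^{(N)} \right)^H \tag{70} $$

where we introduced the superscript $(N)$ to indicate the size of the matrices. Furthermore, the matrix $\mathbf{U}^{(N)}$ is unitary and $\boldsymbol{\Lambda}_h^{(N)} = \mathrm{diag}\left( \lambda_1^{(N)}, \ldots, \lambda_N^{(N)} \right)$ is diagonal and contains the eigenvalues $\lambda_k^{(N)}$ of $\mathbf{R}_h^{(N)}$. 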

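We construct the circulant matrix $\mathbf{C}_h^{(N)}$, which is asymptotically equivalent to the Toeplitz matrix $\mathbf{R}_h^{(N)}$, following along the lines of [16, Section 4.4, Eq. (4.32)]. The first column of the circulant matrix $\mathbf{C}_h^{(N)}$ is given by $\left( c_0^{(N)}, c_1^{(N)}, \ldots, c_{N-1}^{(N)} \right)^T$ with the elements 

$$ c_k^{(N)} = \frac{1}{N} \sum_{l=0}^{N-1} \tilde{S}_h\left( \frac{l}{N} \right) e^{j2\pi \frac{lk}{N}}. \tag{71} $$

Here $\tilde{S}_h(f)$ is the periodic continuation of the PSD $S_h(f)$, i.e., 

$$ \tilde{S}_h(f) = \sum_{k=-\infty}^{\infty} \delta(f - k) \star S_h(f) \tag{72} $$

and $S_h(f)$ being zero outside the interval $|f| \le 0.5$ on which it is defined. The asterisk $\star$ in (72) denotes convolution. 

As we assume that the autocorrelation function of the channel fading process is absolutely summable, the PSD of the channel fading process $S_h(f)$ is Riemann integrable, and it holds that 

$$ \lim_{N\to\infty} c_k^{(N)} = \lim_{N\to\infty} \frac{1}{N} \sum_{l=0}^{N-1} \tilde{S}_h\left( \frac{l}{N} \right) e^{j2\pi \frac{lk}{N}} = \int_{-\frac{1}{2}}^{\frac{1}{2}} S_h(f)\, e^{j2\pi k f}\, df = r_h(k) \tag{73} $$

with $r_h(k)$ being the autocorrelation function of the channel fading process. 

As the eigenvectors of a circulant matrix are given by a discrete Fourier transform (DFT), the eigenvalues $\tilde{\lambda}_k^{(N)}$ with $k = 1, \ldots, N$ of the circulant matrix $\mathbf{C}_h^{(N)}$ are given by 

$$
\begin{aligned}
\tilde{\lambda}_k^{(N)} &= \sum_{l=0}^{N-1} c_l^{(N)} e^{-j2\pi \frac{l(k-1)}{N}} \\
&= \sum_{l=0}^{N-1} \frac{1}{N} \sum_{m=0}^{N-1} \tilde{S}_h\left( \frac{m}{N} \right) e^{j2\pi \frac{ml}{N}} e^{-j2\pi \frac{l(k-1)}{N}} \\
&= \sum_{m=0}^{N-1} \tilde{S}_h\left( \frac{m}{N} \right) \frac{1}{N} \sum_{l=0}^{N-1} e^{j2\pi \frac{l(m-(k-1))}{N}} \\
&= \tilde{S}_h\left( \frac{k-1}{N} \right).
\end{aligned} \tag{74}
$$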
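Consequently, the spectral decomposition of the circulant matrix $\mathbf{C}_h^{(N)}$ is given by 

$$ \mathbf{C}_h^{(N)} = \mathbf{F}^{(N)} \tilde{\boldsymbol{\Lambda}}_h^{(N)} \left( \mathbf{F}^{(N)} \right)^H \tag{75} $$

where the matrix $\mathbf{F}^{(N)}$ is a unitary DFT matrix, i.e., its elements are given by 

$$ \left[ \mathbf{F}^{(N)} \right]_{k,l} = \frac{1}{\sqrt{N}}\, e^{j2\pi \frac{(k-1)(l-1)}{N}}. \tag{76} $$

Furthermore, the matrix $\tilde{\boldsymbol{\Lambda}}_h^{(N)}$ is diagonal with the elements $\tilde{\lambda}_k^{(N)}$ given in (74). 

By this construction the circulant matrix $\mathbf{C}_h^{(N)}$ is asymptotically equivalent to the Toeplitz matrix $\mathbf{R}_h^{(N)}$, see [16, Lemma 4.6], if the autocorrelation function $r_h(k)$ is absolutely summable, which is assumed to be fulfilled. 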
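The construction of the circulant matrix can be reproduced numerically. The sketch below samples a PSD, builds the first column via an inverse DFT as in (71), and checks that the DFT matrix (76) diagonalizes the result with the PSD samples as eigenvalues, cf. (74) and (75). The exponential autocorrelation $r_h(k) = a^{|k|}$ with its closed-form PSD is an assumption chosen because it is absolutely summable; it is not the rectangular PSD used for the figures, and all names are illustrative.

```python
import numpy as np

def psd_exponential(f, a):
    """PSD of r_h(k) = a^|k| (periodic in f with period 1)."""
    return (1 - a**2) / (1 - 2 * a * np.cos(2 * np.pi * f) + a**2)

def circulant_approximation(N, a):
    """Circulant matrix whose first column follows (71), plus c and the PSD samples."""
    l = np.arange(N)
    S = psd_exponential(l / N, a)            # samples of the periodic PSD at l/N
    c = np.fft.ifft(S).real                  # c_k = (1/N) sum_l S(l/N) e^{j 2 pi l k / N}
    C = c[(l[:, None] - l[None, :]) % N]     # entry (i, j) is c_{(i - j) mod N}
    return C, c, S

def dft_matrix(N):
    """Unitary DFT matrix with the sign convention of (76), using 0-based indices."""
    k = np.arange(N)
    return np.exp(2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)

def weak_norm(B):
    """Weak norm |B| = sqrt(Tr(B^H B) / N)."""
    return np.sqrt(np.trace(B.conj().T @ B).real / B.shape[0])

def toeplitz_exponential(N, a):
    """Toeplitz correlation matrix with r_h(k) = a^|k|."""
    idx = np.arange(N)
    return a ** np.abs(idx[:, None] - idx[None, :])

if __name__ == "__main__":
    for N in (32, 256):
        C, c, S = circulant_approximation(N, 0.5)
        print(N, weak_norm(toeplitz_exponential(N, 0.5) - C))
```

For this example the weak norm of the difference between the Toeplitz and the circulant matrix shrinks as $N$ grows, in line with the asymptotic equivalence used in the following.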

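In the context of proving [16, Lemma 4.6], it is shown that the weak norm of the difference of $\mathbf{R}_h^{(N)}$ and $\mathbf{C}_h^{(N)}$ converges to zero as $N \to \infty$, i.e., 

$$ \lim_{N\to\infty} \left| \mathbf{R}_h^{(N)} - \mathbf{C}_h^{(N)} \right| = 0 \tag{77} $$

where the weak norm of a matrix $\mathbf{B}$ is defined as 

$$ |\mathbf{B}| = \left( \frac{1}{N} \mathrm{Tr}\left[ \mathbf{B}^H \mathbf{B} \right] \right)^{\frac{1}{2}}. \tag{78} $$

This fact will be used later on. 

To exploit the asymptotic equivalence of $\mathbf{R}_h^{(N)}$ and $\mathbf{C}_h^{(N)}$ for the current problem, we have to show that the matrices in the argument of the log det operation on the LHS and the RHS of (69), i.e., 

$$ \mathbf{K}_1^{(N)} = \frac{1}{\sigma_n^2} \mathbf{R}_h^{(N)} \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \tag{79} $$

$$ \mathbf{K}_2^{(N)} = \frac{1}{\sigma_n^2} \mathbf{C}_h^{(N)} \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \tag{80} $$

are asymptotically equivalent. 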

In this context, we have to show that both matrices are bounded in the strong norm, and that the weak norm of their difference converges to zero for $N \to \infty$ [16, Section 2.3]. 

Concerning the condition with respect to the strong norm we have to show that 

$$ \left\| \mathbf{K}_1^{(N)} \right\| = \left\| \frac{1}{\sigma_n^2} \mathbf{R}_h^{(N)} \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right\| < \infty \tag{81} $$

$$ \left\| \mathbf{K}_2^{(N)} \right\| = \left\| \frac{1}{\sigma_n^2} \mathbf{C}_h^{(N)} \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right\| < \infty \tag{82} $$

with the strong norm of the matrix $\mathbf{B}$ defined by 

$$ \|\mathbf{B}\|^2 = \max_{k} \gamma_k \tag{83} $$

where $\gamma_k$ are the eigenvalues of the Hermitian nonnegative definite matrix $\mathbf{B}\mathbf{B}^H$. The diagonal matrix $\mathrm{E}[\mathbf{Z}]$ contains the average transmit powers of the individual transmit symbols on its diagonal. Thus, its entries are bounded. In addition, as the strong norms of $\mathbf{R}_h^{(N)}$ and $\mathbf{C}_h^{(N)}$ are bounded, too, the strong norms of $\mathbf{K}_1^{(N)}$ and $\mathbf{K}_2^{(N)}$ are bounded. Concerning the boundedness of the eigenvalues of the Hermitian Toeplitz matrix $\mathbf{R}_h^{(N)}$ see [16, Lemma 4.1]. 

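Furthermore, the weak norm of the difference $\mathbf{K}_1^{(N)} - \mathbf{K}_2^{(N)}$ converges to zero for $N \to \infty$ as 

$$
\begin{aligned}
\left| \mathbf{K}_1^{(N)} - \mathbf{K}_2^{(N)} \right| &= \left| \frac{1}{\sigma_n^2} \mathbf{R}_h^{(N)} \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N - \frac{1}{\sigma_n^2} \mathbf{C}_h^{(N)} \mathrm{E}[\mathbf{Z}] - \mathbf{I}_N \right| \\
&= \frac{1}{\sigma_n^2} \left| \left( \mathbf{R}_h^{(N)} - \mathbf{C}_h^{(N)} \right) \mathrm{E}[\mathbf{Z}] \right| \\
&\overset{(a)}{\le} \frac{1}{\sigma_n^2} \left| \mathbf{R}_h^{(N)} - \mathbf{C}_h^{(N)} \right| \left\| \mathrm{E}[\mathbf{Z}] \right\|
\end{aligned} \tag{84}
$$

where for (a) we have used [16, Lemma 2.3]. As $\|\mathrm{E}[\mathbf{Z}]\|$ is bounded, we get for $N \to \infty$ 

$$ \lim_{N\to\infty} \left| \mathbf{K}_1^{(N)} - \mathbf{K}_2^{(N)} \right| \le \lim_{N\to\infty} \frac{1}{\sigma_n^2} \left| \mathbf{R}_h^{(N)} - \mathbf{C}_h^{(N)} \right| \left\| \mathrm{E}[\mathbf{Z}] \right\| = 0 \tag{85} $$

due to (77). Thus, we have shown the asymptotic equivalence of $\mathbf{K}_1^{(N)}$ and $\mathbf{K}_2^{(N)}$. 

As $\mathbf{K}_1^{(N)}$ and $\mathbf{K}_2^{(N)}$ are asymptotically equivalent, with [16, Theorem 2.4] the equality in (69) holds. For ease of notation, in the following we omit the use of the superscript $(N)$ for all matrices and eigenvalues. 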
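Based on (69) the evaluation of the supremum in (68) can be substituted by 

$$
\begin{aligned}
\lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \log\det\left( \frac{1}{\sigma_n^2} \mathbf{C}_h \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right)
&\overset{(a)}{=} \lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \log\det\left( \frac{1}{\sigma_n^2} \mathbf{F} \tilde{\boldsymbol{\Lambda}}_h \mathbf{F}^H \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right) \\
&\overset{(b)}{=} \lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \log\det\left( \frac{1}{\sigma_n^2} \tilde{\boldsymbol{\Lambda}}_h \mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F} + \mathbf{I}_N \right)
\end{aligned} \tag{86}
$$

where for (a) we have used (75) and (b) is based on the following relation 

$$ \det\left( \mathbf{A}\mathbf{B} + \mathbf{I} \right) = \det\left( \mathbf{B}\mathbf{A} + \mathbf{I} \right) \tag{87} $$

which holds as $\mathbf{A}\mathbf{B}$ has the same eigenvalues as $\mathbf{B}\mathbf{A}$ for $\mathbf{A}$ and $\mathbf{B}$ being square matrices [19, Theorem 1.3.20]. 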

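As the matrix $\frac{1}{\sigma_n^2} \tilde{\boldsymbol{\Lambda}}_h \mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F} + \mathbf{I}_N$ in the argument of the logarithm on the RHS of (86) is positive definite, using Hadamard's inequality we can upper-bound the argument of the supremum on the RHS of (86) as follows 

$$ \log\det\left( \frac{1}{\sigma_n^2} \tilde{\boldsymbol{\Lambda}}_h \mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F} + \mathbf{I}_N \right) \le \sum_{k=1}^{N} \log\left( \frac{1}{\sigma_n^2} \tilde{\lambda}_k \left[ \mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F} \right]_{k,k} + 1 \right) \tag{88} $$

where $\left[ \mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F} \right]_{k,k}$ are the diagonal entries of the matrix $\mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F}$. Note that this means that distributions of the input sequences $\mathbf{z}$ which lead to the case that the matrix $\mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F}$ is diagonal maximize the RHS of (86). Since all elements of $\mathbf{F}$ have magnitude $1/\sqrt{N}$, each diagonal entry $\left[ \mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F} \right]_{k,k}$ equals the average transmit power $\frac{1}{N} \sum_{l=1}^{N} \mathrm{E}[z_l]$. Using (88), the RHS of (86) is given by 

$$
\begin{aligned}
\lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \log\det\left( \frac{1}{\sigma_n^2} \tilde{\boldsymbol{\Lambda}}_h \mathbf{F}^H \mathrm{E}[\mathbf{Z}] \mathbf{F} + \mathbf{I}_N \right)
&= \lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \sum_{k=1}^{N} \log\left( \frac{1}{\sigma_n^2} \tilde{\lambda}_k \frac{1}{N} \sum_{l=1}^{N} \mathrm{E}[z_l] + 1 \right) \\
&= \lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \sum_{k=1}^{N} \log\left( \frac{1}{\sigma_n^2} \tilde{\lambda}_k\, \mathrm{E}\left[ \frac{1}{N} \sum_{l=1}^{N} z_l \right] + 1 \right).
\end{aligned} \tag{89}
$$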



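It remains to evaluate the supremum on the RHS of (89). However, as the logarithm is a monotonically increasing function, with the maximum average power constraint in (61) the supremum in (89) is given by 

$$
\begin{aligned}
\lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \sum_{k=1}^{N} \log\left( \frac{1}{\sigma_n^2} \tilde{\lambda}_k\, \mathrm{E}\left[ \frac{1}{N} \sum_{l=1}^{N} z_l \right] + 1 \right)
&= \lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} \log\left( \frac{\sigma_x^2}{\sigma_n^2} \tilde{\lambda}_k + 1 \right) \\
&\overset{(a)}{=} \lim_{N\to\infty} \frac{1}{N} \log\det\left( \frac{\sigma_x^2}{\sigma_n^2} \mathbf{C}_h + \mathbf{I}_N \right) \\
&\overset{(b)}{=} \lim_{N\to\infty} \frac{1}{N} \log\det\left( \frac{\sigma_x^2}{\sigma_n^2} \mathbf{R}_h + \mathbf{I}_N \right)
\end{aligned} \tag{90}
$$

where (a) is based on (75) and for (b) we have used the asymptotic equivalence of the circulant matrix $\mathbf{C}_h$ and the Toeplitz matrix $\mathbf{R}_h$. 

Now, using (69), (86), (89), and (90) the supremum in (68) is given by 

$$ \lim_{N\to\infty} \frac{1}{N} \sup_{\mathcal{P}} \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right) = \lim_{N\to\infty} \frac{1}{N} \log\det\left( \frac{\sigma_x^2}{\sigma_n^2} \mathbf{R}_h + \mathbf{I}_N \right). \tag{91} $$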
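For a circulant correlation matrix, the Hadamard argument in (88) to (90) already holds for finite $N$: among all average power profiles with the same mean, the constant profile maximizes the log-determinant. The following sketch checks this on random power profiles. The PSD model ($r_h(k) = a^{|k|}$), the parameter values, and the function names are assumptions for illustration; for the Toeplitz matrix $\mathbf{R}_h$ the statement is only asymptotic.

```python
import numpy as np

def circulant_from_psd(N, a):
    """Circulant matrix built as in (71) from the PSD of r_h(k) = a^|k|."""
    l = np.arange(N)
    S = (1 - a**2) / (1 - 2 * a * np.cos(2 * np.pi * l / N) + a**2)
    c = np.fft.ifft(S).real
    return c[(l[:, None] - l[None, :]) % N]

def logdet_rate(C, avg_powers, sigma_n2):
    """(1/N) log det(C E[Z] / sigma_n^2 + I_N) for E[Z] = diag(avg_powers)."""
    N = C.shape[0]
    M = C @ np.diag(avg_powers) / sigma_n2 + np.eye(N)
    return np.linalg.slogdet(M)[1] / N

if __name__ == "__main__":
    N, sigma_x2, sigma_n2 = 16, 1.0, 0.2
    C = circulant_from_psd(N, 0.6)
    rng = np.random.default_rng(3)
    constant = logdet_rate(C, np.full(N, sigma_x2), sigma_n2)
    worst_gap = np.inf
    for _ in range(50):
        p = rng.exponential(size=N)
        p *= sigma_x2 / p.mean()          # enforce the average power sigma_x^2
        worst_gap = min(worst_gap, constant - logdet_rate(C, p, sigma_n2))
    print("smallest gap (constant minus random profile):", worst_gap)
```

The printed gap is non-negative, i.e., no random profile with the same average power exceeds the constant-power value that appears on the RHS of (90).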



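However, this means that the entropy rate $h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$ in (67) is lower-bounded by 

$$
\begin{aligned}
h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P) &\ge \lim_{N\to\infty} \frac{1}{N} \left\{ \log\left( (\pi e)^N \det(\mathbf{R}_h) \right) - \log\det\left( \frac{1}{\sigma_n^2} \mathbf{R}_h \mathrm{E}[\mathbf{Z}] + \mathbf{I}_N \right) \right\} \\
&\ge \lim_{N\to\infty} \frac{1}{N} \left\{ \log\left( (\pi e)^N \det(\mathbf{R}_h) \right) - \log\det\left( \frac{\sigma_x^2}{\sigma_n^2} \mathbf{R}_h + \mathbf{I}_N \right) \right\} \\
&\overset{(a)}{=} \lim_{N\to\infty} \frac{1}{N} \log\det\left( \pi e\, \mathbf{R}_{e_{\mathrm{joint,CM}}} \right)
\end{aligned} \tag{92}
$$

where for (a) we have used (57) and (58), and where $\mathbf{R}_{e_{\mathrm{joint,CM}}}$ is the estimation error correlation matrix in case all input symbols have a constant modulus with power $\sigma_x^2$, cf. (56), 

$$ \mathbf{R}_{e_{\mathrm{joint,CM}}} = \mathbf{R}_h - \mathbf{R}_h \left( \mathbf{R}_h + \frac{\sigma_n^2}{\sigma_x^2} \mathbf{I}_N \right)^{-1} \mathbf{R}_h. \tag{93} $$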



This means that the entropy rate $h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$ is minimized for the given maximum average power constraint in (61) when all input symbols are constant modulus input symbols with power $\sigma_x^2$. Note that this includes the case that each $L$-th symbol is a pilot symbol with power $\sigma_x^2$ and all other symbols are constant modulus data symbols with power $\sigma_x^2$. 

In conclusion, we have shown that the differential entropy rate $h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)$ is minimized for constant modulus data symbols with power $\sigma_x^2$, i.e., 

$$ h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P) \ge h'(\mathbf{e}_{\mathrm{joint}}|\mathbf{x}_D,\mathbf{x}_P)\Big|_{\mathrm{CM}} = \lim_{N\to\infty} \frac{1}{N} \log\det\left( \pi e\, \mathbf{R}_{e_{\mathrm{joint,CM}}} \right). \tag{94} $$

Appendix B 

Estimation Error Spectra $S_{e_{\mathrm{pil}}}(f)$ and $S_{e_{\mathrm{joint,CM}}}(f)$ 

First, we calculate the PSD $S_{e_{\mathrm{pil}}}(f)$ of the channel estimation error in case of a solely pilot based channel estimation. The channel estimation error in the frequency domain is given by 

$$ E_N\left( e^{j2\pi f} \right) = \sum_{k=1}^{N} e_{\mathrm{pil},k}\, e^{-j2\pi f k} \tag{95} $$

where $e_{\mathrm{pil},k}$ are the elements of the vector $\mathbf{e}_{\mathrm{pil}}$. In the following we are interested in the case $N \to \infty$. As in this case the sum in (95) does not exist, in the following we discuss $\lim_{N\to\infty} \frac{1}{N} E_N\left( e^{j2\pi f} \right)$, which can be expressed as follows 



lim l-E^{e^^-f) lim 1 



1=1 
L 



lim — 



Yat r,(p^'^''^-f) 
(7x 



= lim — 

N^oo N 



= lim — 

Af-s>oo 

= lim — 

7V-s>oo 



1=1 



,i27r/N 



L ■ W{e 



(96) 



For (a) we have used that the estimation error in frequency domain is the sum of the interpolation errors at the individual symbol time instances between the pilot symbols, where the temporal shift yields the phase shift of $2\pi l f$. Here $E_{N,l}\left( e^{j2\pi L f} \right)$ is the frequency transform of the estimation error at the symbol positions with the distance $l$ from the previous pilot symbol, i.e., 

$$ E_{N,l}\left( e^{j2\pi L f} \right) = \sum_{k=1}^{N/L} e_{\mathrm{pil},(k-1)L+1+l}\; e^{-j2\pi f k L} \quad \text{for } l = 0, \ldots, L-1 \tag{97} $$



where without loss of generality we assume that $N$ is an integer multiple of $L$ and that the transmit sequence starts with a pilot symbol. Equality (b) results from expressing $E_{N,l}\left( e^{j2\pi L f} \right)$ by the difference between the actual channel realization and the estimated channel realization at the different interpolation positions in time domain transferred to frequency domain. Here, without loss of generality, we assume that the pilot symbols are given by $\sigma_x$. Furthermore, $W_l\left( e^{j2\pi L f} \right)$ is the transfer function of the interpolation filter for the symbols at distance $l$ from the previous pilot symbol, and $Y_{N,P}\left( e^{j2\pi L f} \right)$ is the channel output at the pilot symbol time instances transferred to frequency domain. For (c) we have used that the sum of the phase shifted channel realizations in frequency domain at sampling rate $1/L$ corresponds to the frequency domain representation of the fading process at symbol rate. In addition, we have used that for $N \to \infty$ the interpolation filter transfer function $W_l\left( e^{j2\pi L f} \right)$, which is an MMSE interpolation filter, can be expressed as 



W,(e 



(98) 



i.e., the interpolation filter transfer functions for the individual time shifts are equal except for a phase shift. Finally, for (d) we have expressed $Y_{N,P}\left( e^{j2\pi L f} \right)$ as the sum of the frequency domain representations of the fading process and the additive noise process. 

Based on (96), the PSD $S_{e_{\mathrm{pil}}}(f)$ is given by 



S.,{f) = lim -E \E^{e 



N^oo N 

lim — E 





2 











^1 ^1 

1=1 ^ 1=1 ^ 



(99) 



where for (a) we have used that $S_h(f)$ is real and, thus, the MMSE filter $W\left( e^{j2\pi L f} \right)$ is also real, see below. 

The MMSE filter transfer function $W\left( e^{j2\pi L f} \right)$ is given by 



where we have used that 



(100) 



(101) 



Inserting (100) into (99) yields 



^,(ei2-^/) + ^ 



at 



^ X 
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(102) 



where (a) results from (101) and for (b) we simplified the notation and substituted $e^{j2\pi f}$ by $f$ to get a consistent notation. 

The PSD $S_{e_{\mathrm{joint,CM}}}(f)$ is then obviously given by setting $L = 1$ in (102), i.e., 

$$ S_{e_{\mathrm{joint,CM}}}(f) = \frac{S_h(f)}{\rho\, \frac{S_h(f)}{\sigma_h^2} + 1} $$

as all data symbols are assumed to be known and of constant modulus with power $\sigma_x^2$, cf. (34). 
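The error spectrum for constant modulus symbols can be cross-checked against the matrix expression (93). The sketch below compares $\frac{1}{N}\log\det(\pi e\, \mathbf{R}_{e_{\mathrm{joint,CM}}})$ for a finite Toeplitz matrix with the integral of $\log(\pi e\, S_{e_{\mathrm{joint,CM}}}(f))$ over one period. It relies on assumptions that are not stated in this appendix: the correlation model $r_h(k) = a^{|k|}$ (so $\sigma_h^2 = 1$), the SNR definition $\rho = \sigma_x^2 \sigma_h^2 / \sigma_n^2$, and the use of Szegő's theorem to relate the two quantities. All names are illustrative.

```python
import numpy as np

def psd_exponential(f, a):
    """PSD of r_h(k) = a^|k| (periodic in f with period 1)."""
    return (1 - a**2) / (1 - 2 * a * np.cos(2 * np.pi * f) + a**2)

def error_psd_cm(S_h, rho, sigma_h2):
    """Error PSD for known constant modulus symbols: S_h / (rho S_h / sigma_h^2 + 1)."""
    return S_h / (rho * S_h / sigma_h2 + 1.0)

def entropy_rate_matrix(N, a, sigma_x2, sigma_n2):
    """(1/N) log det(pi e R_e) with R_e from (93) for an N x N Toeplitz R_h."""
    idx = np.arange(N)
    R_h = a ** np.abs(idx[:, None] - idx[None, :])
    R_e = R_h - R_h @ np.linalg.solve(R_h + (sigma_n2 / sigma_x2) * np.eye(N), R_h)
    return np.log(np.pi * np.e) + np.linalg.slogdet(R_e)[1] / N

def entropy_rate_spectral(a, sigma_x2, sigma_n2, num_points=4096):
    """Midpoint Riemann sum of log(pi e S_e(f)) over |f| <= 0.5."""
    f = (np.arange(num_points) + 0.5) / num_points - 0.5
    sigma_h2 = 1.0                                   # r_h(0) for this model
    rho = sigma_x2 * sigma_h2 / sigma_n2             # assumed SNR definition
    S_e = error_psd_cm(psd_exponential(f, a), rho, sigma_h2)
    return np.mean(np.log(np.pi * np.e * S_e))

if __name__ == "__main__":
    print(entropy_rate_matrix(256, 0.5, 1.0, 0.1), entropy_rate_spectral(0.5, 1.0, 0.1))
```

For this example the two values agree closely already at $N = 256$, and the remaining gap shrinks as $N$ grows.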
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